Abstract. The method of trilinear aggregating with implicit canceling for the design of fast matrix
multiplication  (MM)  algorithms  is  revised  and  is  formally  presented  with  the  use  of  Generating 
Tables  and  of  linear  transformations  of  the  problem  of  MM.  It  is  shown  how  to  derive  the  exponent 
of MM below 2.67 even without the use of approximation algorithms.

1. Introduction.

The  attention  to  the  problem  of  fast  matrix  multiplication  hereafter  referred  to  as  MM  has 
remained very high since 1968 when V. Strassen proved that $4.8N^{2.81}$ arithmetic operations rather
than $2N^3$ suffice to multiply two $N \times N$ matrices for all $N$, see [1]. (For comparison, a method
of nonasymptotic acceleration of MM [2] presented in January 1966 at the seminar of Dr. G. M.
Adelson-Velskii, Dr. A. S. Kronrod, and Dr. Y. M. Landis in Moscow has not been published because
of the lack of interest in that method outside the seminar in 1966.)

The  attempts  to  improve  the  exponent  2.81  followed.  Smaller  exponents  could  automatically 
result  from  any  sufficiently  fast  (in  terms  of  the  number  of  nonscalar  multiplications  involved) 
bilinear  algorithm  for  a  MM  problem  of  any  specific  shape  because  of  the  possibility  to  use 
bilinear  algorithms  recursively.  (Hereafter  that  number  of  nonscalar  multiplications  is  called  the 
multiplicative  complexity  of  a  bilinear  algorithm.)  The  design  of  fast  basic  algorithms  for  the 
recursion  turned  out  to  be  a  harder  problem.  The  next  improvement  of  the  exponent  from  2.81 
to 2.7804 came about in 1978, see [3]. The proof techniques (trilinear aggregating, uniting and
canceling, TAUC) had been sketched in the earlier paper [4]. However the actual potential power of
the TAUC was not fully appreciated even in 1978. Later another approach to the acceleration
of MM (called the method of APA-algorithms) appeared in [5] and was justified in [6]. This
reduced the exponent to 2.7799. Then the methods of APA-algorithms and TAUC were
combined, which led to a more serious asymptotic acceleration of MM, see [7-10]. On the
other  hand,  it  turned  out  that  the  TAUC  are  closely  related  to  the  Direct  Sum  Problem  (DSP)  of 
the  fast  evaluation  of  (the  direct  sum  of)  r  independent  sets  of  bilinear  forms,  r  >  1 .  According  to 
the  Direct  Sum  Conjecture  (DSC),  due  to  [11],  the  multiplicative  complexity  of  the  whole  problem 
(Direct  Sum  Problem)  is  equal  to  the  sum  of  multiplicative  complexities  of  the  r  independent 
problems of the evaluations of r given sets. On the contrary, the TAUC successfully exploits the
advantage of simultaneous evaluation of several independent sets of matrix products. In the case
of APA-algorithms the TAUC enables us to disprove the DSC. The first formal counterexample
to the DSC over the class of APA-algorithms appeared in [8, Remark, p. 37] although the DSC
has not been studied in [8]. (See other counterexamples also based on the TAUC in [9, 10].) In the
case of usual algorithms the DSP remains open. This might be due to our poor knowledge of the
lower bounds. For example, no method is known for $10 \times 10$ MM in 650 nonscalar multiplications
while two $10 \times 10$ matrix products can be evaluated using the TAUC in 1300 multiplications.
(However this does not disprove the DSC because the best known lower bound for $10 \times 10$ MM
is only 199 multiplications.) Of course, the latter algorithm for the pair of $10 \times 10$ MM can be
transformed into a fast algorithm for $10 \times 10$ by $10 \times 20$ MM in 1300 multiplications which can
be applied as a basis for the recursion to derive an asymptotically fast method for MM. On the
other hand, the recursion based on a method of $10 \times 10$ MM in 650 multiplications would result in a smaller
exponent of MM. Although, as we mentioned, such a method for $10 \times 10$ MM might not exist, it
turned out that practically the same exponent can be obtained as if it existed because the recursion
can also be used with an algorithm for a direct sum of MM problems as a basis. A similar result
for any basic algorithm for an arbitrary direct sum of MM problems is due to [10] and is known
as the Exponential Direct Sum Theorem, EDST; see [9]. It is worth mentioning that historically
the earlier techniques of the TAUC motivated the EDST as a means to reinforce the power of the
TAUC.

By  combining  the  new  methods  of  the  TAUC  and  APA-algorithms  with  each  other,  with  the 
EDST,  and  with  the  recursive  construction  (which  is  also  called  the  Tensor  Product  Construction 
(TPC)) smaller exponents of MM were obtained in 1979; see [7-10]. (The references to the TAUC
are omitted in [7, 10] but the reader can easily notice common basic elements of the patterns of
[7, 10] and of the earlier 2-Procedure of the TAUC of [3, 4, 12]; see also [8], [9, Section 19], and
[13, Section 4].) In particular, the exponent 2.522 was obtained by combining the construction of
[8] with the EDST and was announced on October 26, 1979, at the Conference on the Complexity
Theory in Oberwolfach, October 21-27, 1979 (see [14]) although only the out-of-date 2.548 appeared
as the “world record” in the EATCS report on that conference [15]. Later improvements in
1980-81, see [9, 10, 16, 17], which reduced the exponent to 2.5167, 2.5161, 2.496, also relied on the
combinations of the techniques of the TAUC, APA-algorithms, EDST, TPC, and on some new
elements of the analysis. However in general the progress seems to have lost momentum after 1979
because the most natural combinations of that kind have already been explored. (The so-called Partial
Matrix Multiplication technique, see [7], does not seem to lead to a serious improvement, if any,
over the EDST.)

We believe that further progress in the acceleration of MM, and perhaps in the solution
of the DSP for usual algorithms, depends on the success in the analysis of the methods of trilinear
aggregating (TA) because TA constitutes the basis for the design of the fastest MM algorithms.
This paper is our extensive attempt at such an analysis. Thus we intentionally focus our attention
on TA.

We formally define the process of TA by reducing it to the design of Generating Tables which
in turn are obtained from certain partitions of finite sets. Until the last section we do not involve
APA-algorithms because we tend to simplify the problem and to understand how successfully TA
can work without them. Our study shows that the resulting MM algorithms are quite fast even if
APA-algorithms  are  not  used.  On  the  other  hand,  the  structure  of  our  algorithms  is  more  regular 
than  the  structure  of  the  faster  algorithms  for  MM  obtained  via  the  APA-algorithms. 

To  make  the  paper  self-contained  we  formally  state  the  problem  of  MM  and  of  the  direct 
sum  of  MM  and  prove  the  EDST  in  Sections  2  and  3.  In  our  proof  we  follow  [9]  using  Theorem 
13.1  of  [9]  as  a  basis  but  the  successful  notation  borrowed  from  [10]  helped  us  to  make  the  proof 
much  simpler.  (Formally  we  prove  the  EDST  for  usual  algorithms.  The  extension  to  the  case  of 
APA-algorithms is well understood now; see [6, 9, 10, 17].) Our proof of the EDST, like the
proofs of [10, 17], is elementary and does not use tensorial calculus. Also in Section 2 we show that
the asymptotic complexity of MM cannot depend on the choice of the field of constants unless such
a field is finite. In Section 4 we revisit the TAUC. We present it more formally than we did earlier
and in a different version. The procedures of trilinear aggregating (TA) and consequently MM
algorithms are defined by Generating Tables (GT). The resulting algorithms for MM appear as
decompositions of special trilinear forms (associated with the given problems of MM) into sums of
aggregates and correction terms obtained from the Generating Tables. The total number of terms
equals  the  multiplicative  complexity  of  the  algorithms  and  consequently  defines  the  exponents  of 
MM.  Hence  our  objective  is  the  reduction  of  the  total  number  of  terms  and,  in  particular,  of  the 
number of the correction terms because the aggregates are not numerous.
In Section 5 we rewrite the GTs so that the design of algorithms for larger problems of MM
appears in a more explicit fashion than in the cases where it is defined by the recursive process that
starts with the algorithms for small MM problems. Also we define the degree and the dimension
of correction terms of a Generating Table and show why it is desirable that all or most of the
correction terms have degree 1. In Section 6 we show that the latter property follows if the GTs
are defined by some appropriate partitions of the finite sets. We give two examples of the GTs (the
First and the Second Constructions of Section 6) where we demonstrate which properties of the
partitions are to be exploited. In Section 7 we describe the method of Implicit Canceling (IC) of
correction terms of degree 1, see [13], to be combined with TA to define Trilinear Aggregating with
Implicit Canceling (TAIC). TAIC is a modification of TAUC. It provides us with an insight into
the techniques of the design of fast MM algorithms. Combining TAIC with the First Construction
of Section 6 gives us a quite regular and homogeneous algorithm that evaluates (the direct sum of)
$(2u)!/(u!)^2$ independent products of $n^u \times n^{2u}$ by $n^{2u} \times n^u$ matrices in $(n+1)^{4u}$ multiplicative steps
for arbitrary natural $n$ and $u$. This defines the exponents less than 2.67 without the use of auxiliary
APA-algorithms. (The best previous result of that kind was 2.773...; see [13].) Combining TAIC
with the Second Construction of Section 6 gives a similar method for the direct sum of $(3v)!/(v!)^3$
independent problems of $(n-1)^{3v} \times (n-1)^{3v}$ MM involving $(n+1)^{9v}$ multiplicative steps for
arbitrary natural $n$ and $v$. This defines the exponents less than 2.7288 (also without the use of
APA-algorithms). Technically the latter algorithm involves TAUC and a method of Alternating
Summation of Aggregates which is used to cancel the terms of positive codimensions. Finally
in Section 8 we sketch the possible generalizations of our approach. This includes the study of
the partitions of finite sets for GTs (with the First and Second Constructions of Section 6 as
the models) and of the Generating $\lambda$-Tables. In the latter case the indeterminates appear in the
GTs with some constant coefficients which may depend on a parameter $\lambda$. This case incorporates
TAUC with a special Canceling Procedure (see [3, 12]) and the design of APA-algorithms which
are sometimes also called $\lambda$-algorithms (see [8, 9, 17]).
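
As a rough numerical check of the two exponents quoted above, the following Python sketch evaluates $3\tau$, where $\tau$ solves $r\,(IJK)^{\tau} = M$ for a direct sum of $r$ equal-sized MM problems $\langle I, J, K \rangle$ evaluated in $M$ multiplicative steps. This relation is our reading of how the Exponential Direct Sum Theorem of Section 3 applies to equal-sized problems. The counts are taken as $(2u)!/(u!)^2$ products of $n^u \times n^{2u}$ by $n^{2u} \times n^u$ matrices in $(n+1)^{4u}$ steps for the First Construction, and $(3v)!/(v!)^3$ problems of $(n-1)^{3v} \times (n-1)^{3v}$ MM in $(n+1)^{9v}$ steps for the Second. All function names are illustrative assumptions.

```python
import math


def log_count(parts, m):
    """Natural log of (parts*m)! / (m!)**parts, the assumed number of problems."""
    return math.lgamma(parts * m + 1) - parts * math.lgamma(m + 1)


def exponent_for_equal_problems(log_r, log_ijk, log_steps):
    """Return 3*tau where r * (I*J*K)**tau == M, with r, I*J*K, M given as logs."""
    return 3.0 * (log_steps - log_r) / log_ijk


def first_construction_exponent(n, u):
    # Assumed shape: I*J*K = n**(4u), step count M = (n+1)**(4u).
    return exponent_for_equal_problems(
        log_count(2, u), 4 * u * math.log(n), 4 * u * math.log(n + 1))


def second_construction_exponent(n, v):
    # Assumed shape: I*J*K = (n-1)**(9v), step count M = (n+1)**(9v).
    return exponent_for_equal_problems(
        log_count(3, v), 9 * v * math.log(n - 1), 9 * v * math.log(n + 1))


if __name__ == "__main__":
    print(first_construction_exponent(8, 10**5))
    print(second_construction_exponent(21, 10**5))
```

Under these assumptions the choices $n = 8$, $u = 10^5$ and $n = 21$, $v = 10^5$ give values just below 2.67 and 2.7288 respectively; small $u$ or $v$ give noticeably larger values because the number of independent problems then grows too slowly.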

We hope that our analysis will help the reader to understand the principles of trilinear
aggregating (which we consider the basic technique for fast MM) and finally will lead to a new
acceleration of MM in the future.


2. Some Basic Notions, Basic Notation, Basic Construction.

Hereafter $u_{ik}$ designates the $(i,k)$ entry of a matrix $U$, $\underline{U}$ designates the vector of all
entries of $U$ taken in a fixed order, and $\mathrm{Tr}\, U = \sum_i u_{ii}$ designates the trace of $U$. $I$, $J$, $K$ are given natural
numbers; $i$, $j$, $k$ are integer parameters.

Definition 2.1. $\langle I, J, K \rangle$, the problem of MM. Given a field (of constants) $F$, an $I \times J$ matrix
$X$, and a $J \times K$ matrix $Y$ whose entries are indeterminates. Evaluate (the entries of) the product
$XY$ by a straight line arithmetic algorithm using the constants from $F$.

$\langle I, J, K \rangle$ is an example of a bilinear arithmetic computational problem, that is, the problem
of the evaluation of a given set of bilinear forms, $B$. In the case of $\langle I, J, K \rangle$, $B$ is the set of the
entries of $XY$ which are bilinear forms of the entries of $X$, $Y$.

In general, a bilinear problem can be equivalently represented by a set of bilinear forms,

$$B = \{B_\eta(\underline{X}, \underline{Y})\}, \tag{2.1}$$

by a trilinear form

$$T = T(\underline{X}, \underline{Y}, \underline{Z}) = \sum_{\eta} B_\eta(\underline{X}, \underline{Y})\, z_\eta, \tag{2.2}$$

or by a tensor $t = (t_{\mu\nu\eta})$ of the coefficients of $T$; see [4, 18]. For surveys on bilinear problems and
algorithms, see [19-23].

In the case of $\langle I, J, K \rangle$,

$$T = \mathrm{Tr}(XYZ) = \sum_{i,j,k} x_{ij}\, y_{jk}\, z_{ki}. \tag{2.3}$$

Here $X$, $Y$ are the given matrices whose product is to be evaluated (see Definition 2.1) and $Z = (z_{ki})$ is the (auxiliary)
$K \times I$ matrix whose entries are indeterminates.

As another example of bilinear problems we mention polynomial multiplication (PM), also
known as convolution of vectors (see [21, 23]). PM is defined by the following trilinear form,

$$T = \sum_{i=0}^{p-1} \sum_{j=0}^{\ldots} x_i\, y_j\, z_{i+j}. \tag{2.4}$$

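Bilinear algorithms for bilinear problems can be equivalently represented as the following
bilinear, trilinear, or tensorial identical decompositions.

$$\forall \eta:\quad B_\eta(\underline{X}, \underline{Y}) = \sum_{q=1}^{M} f''_{q\eta}\, L_q(\underline{X})\, L'_q(\underline{Y}), \tag{2.5}$$

$$T(\underline{X}, \underline{Y}, \underline{Z}) = \sum_{q=1}^{M} L_q(\underline{X})\, L'_q(\underline{Y})\, L''_q(\underline{Z}), \tag{2.6}$$

$$t_{\mu\nu\eta} = \sum_{q=1}^{M} f_{q\mu}\, f'_{q\nu}\, f''_{q\eta} \quad \text{for all } \mu, \nu, \eta. \tag{2.7}$$

Here

$$L_q(\underline{X}) = \sum_{\mu} f_{q\mu} x_\mu, \quad L'_q(\underline{Y}) = \sum_{\nu} f'_{q\nu} y_\nu, \quad L''_q(\underline{Z}) = \sum_{\eta} f''_{q\eta} z_\eta, \tag{2.8}$$

$$f_{q\mu},\ f'_{q\nu},\ f''_{q\eta} \in F \quad \text{for all } q, \mu, \nu, \eta. \tag{2.9}$$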

Hereafter the reader may identify a bilinear algorithm with any of its three representations
but actually the evaluation of $B$ proceeds by first computing the $M$ products $\pi_q(\underline{X}, \underline{Y}) = L_q(\underline{X}) L'_q(\underline{Y})$
for all $q$, and then computing $B_\eta(\underline{X}, \underline{Y}) = \sum_q f''_{q\eta}\, \pi_q(\underline{X}, \underline{Y})$ for all $\eta$. Hereafter $M$ is called the
rank of a bilinear algorithm.

In the case of MM the subscripts $\mu$, $\nu$, and $\eta$ are represented by the pairs $(i, j)$, $(j, k)$, and
$(k, i)$ respectively (for example, in such a case $y_\nu = y_{jk}$, $f'_{q\nu} = f'_{qjk}$).
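
To make the two-stage evaluation concrete, here is a small Python sketch based on a widely quoted form of Strassen's seven products for $2 \times 2$ MM. The tables `FX`, `FY`, `FOUT` (our names, not the paper's) hold, for each product, the coefficients of the linear form in the entries of $X$, of the linear form in the entries of $Y$, and the coefficients with which the product enters each output entry. For convenience the output columns are indexed by the entry $(i,k)$ of $XY$ rather than by the pair $(k,i)$ of the auxiliary variable $z_{ki}$; this is only a relabeling. The sketch also checks the tensorial identity, that is, that the sum over the products of the triple products of coefficients reproduces the tensor of $2 \times 2$ MM.

```python
# Rows: the seven products. Columns of FX, FY: entries (1,1), (1,2), (2,1), (2,2).
FX = [[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 0, 1],
      [1, 1, 0, 0], [-1, 0, 1, 0], [0, 1, 0, -1]]
FY = [[1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, -1], [-1, 0, 1, 0],
      [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
# Columns of FOUT: entries (1,1), (1,2), (2,1), (2,2) of the product XY.
FOUT = [[1, 0, 0, 1], [0, 0, 1, -1], [0, 1, 0, 1], [1, 0, 1, 0],
        [-1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
RANK = 7


def flatten(a):
    """Entries of a 2x2 matrix in the fixed order (1,1), (1,2), (2,1), (2,2)."""
    return [a[0][0], a[0][1], a[1][0], a[1][1]]


def linear_form(coeffs, entries):
    return sum(c * v for c, v in zip(coeffs, entries))


def bilinear_multiply_2x2(x_mat, y_mat):
    x, y = flatten(x_mat), flatten(y_mat)
    # Stage 1: the RANK nonscalar products of linear forms in X and in Y.
    products = [linear_form(FX[q], x) * linear_form(FY[q], y) for q in range(RANK)]
    # Stage 2: each output entry is a linear combination of those products.
    out = [sum(FOUT[q][e] * products[q] for q in range(RANK)) for e in range(4)]
    return [out[0:2], out[2:4]]


def decomposition_tensor():
    """Sum over q of FX[q][m] * FY[q][n] * FOUT[q][e]."""
    return [[[sum(FX[q][m] * FY[q][n] * FOUT[q][e] for q in range(RANK))
              for e in range(4)] for n in range(4)] for m in range(4)]


def mm_tensor_2x2():
    """Coefficient 1 exactly at x_ij * y_jk contributing to output entry (i, k)."""
    t = [[[0] * 4 for _ in range(4)] for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                t[2 * i + j][2 * j + k][2 * i + k] = 1
    return t
```

For example, `bilinear_multiply_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]` using seven multiplications of linear forms instead of eight.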


We  will  refer  to  the  tensorial  representation  (2.7)  in  Remark  2.1  but  otherwise  the  reader  may 
skip  (2.7).  In  fact,  we  presented  the  tensorial  representation  only  for  the  sake  of  completeness 
because  of  its  wide  use  in  the  literature  on  MM.  Furthermore  we  will  need  only  the  trilinear 
representation  after  Section  3. 

The  equivalence  of  (2.5),  (2.6)  and  (2.7)  is  easily  verified.  For  instance,  for  the  transition  from 
(2.6) to (2.5) equate the coefficients of each indeterminate $z_\eta$ in the left and right sides of (2.6).
Equating the coefficients of all $x_\mu$ or of all $y_\nu$ rather than $z_\eta$ we obtain the two (dual) bilinear
algorithms of the same rank $M$ for the two dual bilinear problems $\{B_\mu(\underline{Y}, \underline{Z})\}$ and $\{B_\nu(\underline{Z}, \underline{X})\}$.

For example, if the original algorithm of rank $M$ solves $\langle I, J, K \rangle$ then the dual ones solve
$\langle J, K, I \rangle$ and $\langle K, I, J \rangle$ and have the same rank, $M$. In fact, such algorithms can also be transformed
into ones of the same rank, $M$, for the problems $\langle J, I, K \rangle$, $\langle I, K, J \rangle$, and $\langle K, J, I \rangle$. (Indeed,
substitute $u_{ji}$, $v_{ik}$, $w_{kj}$ for $x_{ij}$, $z_{ki}$, and $y_{jk}$ respectively in (2.3) and (2.6).) The study of the
asymptotic time-complexity of bilinear algorithms for MM relies on the next definition and
theorem.

Definition 2.2. $\beta = \beta(F)$ is an exponent of MM (over $F$) if there exists a positive constant
$c = c(\beta)$ associated with that exponent $\beta$ such that $cN^{\beta}$ arithmetic operations are sufficient to
solve $\langle N, N, N \rangle$ for all $N$ by straight line algorithms (with the constants from $F$). $\beta^*$ is a limiting
exponent of MM if for all $\varepsilon > 0$, $\beta^* + \varepsilon$ is an exponent of MM.

Theorem 2.1; see [1]. If for some natural numbers $I$, $J$, $K$, $M$ there exists a bilinear algorithm
(2.5)-(2.9) of rank $M$ for $\langle I, J, K \rangle$ then $\beta = 3 \log M / \log(IJK)$ is an exponent of MM.

Outline of Proof. The basic observation for the proof is that in the case of MM the identities
(2.5)-(2.9) remain true if the entries of $X$, $Y$, $Z$ are replaced by $I' \times J'$, $J' \times K'$, and $K' \times I'$
matrices respectively (for arbitrary $I'$, $J'$, $K'$). Then $L_q(\underline{X})$, $L'_q(\underline{Y})$, $L''_q(\underline{Z})$ for all $q$ are also
$I' \times J'$, $J' \times K'$, and $K' \times I'$ matrices respectively and $\mathrm{Tr}(XYZ)$ represents $\langle II', JJ', KK' \rangle$. If
$I = J = K$ we write $I' = J' = K' = I$ and apply the original algorithm to multiply $L_q(\underline{X})$ by
$L'_q(\underline{Y})$ for all $q$. This defines the transition from a bilinear algorithm of rank $M$ for $\langle I, I, I \rangle$ to the
one of rank $M^2$ for $\langle I^2, I^2, I^2 \rangle$. Continuing this process and counting the number of arithmetic
operations we obtain the desired upper bound in the cases $N = I^h$ for all $h$ and then easily
extend the bound to the case of arbitrary $N$. If $\langle I, J, K \rangle$ is an arbitrary triplet we come back to
the square MM by writing $I' = J$, $J' = K$, $K' = I$ and then $I' = K$, $J' = I$, $K' = J$ for
the first two recursive steps. This gives an algorithm of rank $M^3$ for the square MM problem,
$\langle IJK, IJK, IJK \rangle$. ∎
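
The recursion of the proof can be sketched in Python for the simplest case of a rank-7 algorithm for $2 \times 2$ MM applied to $2^h \times 2^h$ matrices: the entries are replaced by blocks, and the same seven products are formed recursively. The sketch below uses a widely quoted form of Strassen's seven products; the function names and the list-based counter are illustrative assumptions, not the paper's notation. It counts the scalar multiplications, which should equal $7^h$, and evaluates the exponent formula of Theorem 2.1.

```python
import math


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def quadrants(a):
    h = len(a) // 2
    return ([r[:h] for r in a[:h]], [r[h:] for r in a[:h]],
            [r[:h] for r in a[h:]], [r[h:] for r in a[h:]])


def recursive_multiply(a, b, counter):
    """Multiply 2^h x 2^h matrices; counter[0] accumulates scalar multiplications."""
    if len(a) == 1:
        counter[0] += 1
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = quadrants(a)
    b11, b12, b21, b22 = quadrants(b)
    # Seven block products; left factors come from a, right factors from b.
    m1 = recursive_multiply(add(a11, a22), add(b11, b22), counter)
    m2 = recursive_multiply(add(a21, a22), b11, counter)
    m3 = recursive_multiply(a11, sub(b12, b22), counter)
    m4 = recursive_multiply(a22, sub(b21, b11), counter)
    m5 = recursive_multiply(add(a11, a12), b22, counter)
    m6 = recursive_multiply(sub(a21, a11), add(b11, b12), counter)
    m7 = recursive_multiply(sub(a12, a22), add(b21, b22), counter)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom


def naive_multiply(a, b):
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def theorem_2_1_exponent(i, j, k, rank):
    """The exponent 3 log M / log(IJK) of Theorem 2.1."""
    return 3.0 * math.log(rank) / math.log(i * j * k)
```

For $I = J = K = 2$, $M = 7$ the formula gives $\log_2 7 \approx 2.807$, which matches the count of $7^h$ multiplications for $N = 2^h$.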

The proof of Theorem 2.1 is constructive. The coefficients of the resulting bilinear algorithm
for $\langle N, N, N \rangle$ can be explicitly expressed through the coefficients of the original one given for
$\langle I, J, K \rangle$.

Remark 2.1. More precisely, the tensor of the coefficients of the resulting algorithm is the
tensorial power of the tensor of the coefficients of the original algorithm if $I = J = K$. If
$I$, $J$, $K$ are arbitrary, the former tensor is the tensorial power of the tensor of the algorithm
for $\langle IJK, IJK, IJK \rangle$. The latter tensor is the product of the three tensors of the three dual
algorithms that include the original one. We will not use this easily verified fact but we will apply
the name Tensor Product Construction (TPC) to the recursive process of the proof of Theorem 2.1.

Theorem 2.1 leads to the problem of the design of bilinear algorithms for $\langle I, J, K \rangle$ where
$\log M / \log(IJK)$ is as small as possible. Before involving ourselves with that main problem we
would  like  to  warn  the  reader  that  we  do  not  mean  to  define  the  smallest  exponent  of  MM  in 
this  way.  To  be  formal,  we  introduce  the  following  definition  which  will  also  be  used  in  the  next 
sections. 

Definition 2.3. Let a bilinear arithmetic computational problem be defined by a set of bilinear
forms $B$, or by a trilinear form $T(\underline{X}, \underline{Y}, \underline{Z})$, or by its tensor $t$. Then $\rho(B) = \rho(T) = \rho(t)$, the
rank of the problem, of its tensor $t$, and of the trilinear form $T(\underline{X}, \underline{Y}, \underline{Z})$, is the minimum rank
of all bilinear algorithms that solve this problem. For arbitrary natural numbers $I$, $J$, $K$, the
rank of $\langle I, J, K \rangle$ is designated by $\rho(\langle I, J, K \rangle)$. (The rank may depend on the choice of the field of
constants $F$ so that strictly speaking we have to write $\rho_F$ rather than $\rho$. Usually we will omit the
subscript $F$ assuming that $F$ is fixed; see also Theorem 2.3 below.)

Using the tensor product construction we obtain $(\rho(\langle I, J, K \rangle))^h \ge \rho(\langle I^h, J^h, K^h \rangle)$ for all
natural $h$. On the other hand, it is known (see [24, 25]) that

$$\rho(\langle I, J, 1 \rangle) = IJ, \qquad \rho(\langle I, J, K \rangle) \ge (I-1)(J+1) + JK \ \text{ if } K > 1. \tag{2.10}$$

In particular, $\rho(\langle 2,2,2 \rangle) \ge 7$ and in fact, $\rho(\langle 2,2,2 \rangle) = 7$, see [1]. If we choose $I = J = K = 2$
and apply Theorem 2.1 then we only obtain the estimate $\rho(\langle 2^h, 2^h, 2^h \rangle) \le 7^h$ while it is known
that $\rho(\langle 2^h, 2^h, 2^h \rangle) < 7^h$ for all $h > 5$; see [9]. Combining the two techniques based on the concept
of APA-algorithms (see [5, 6]) and on the 2-Procedure of trilinear aggregating (see [3, 9, 12]) it
is easy to prove more general results of this kind; see [17] and compare [13].
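
The lower bound (2.10) is easy to tabulate. The short Python sketch below evaluates it as we read it, namely $IJ$ for $K = 1$ and $(I-1)(J+1) + JK$ for $K > 1$; the side condition and the function name are assumptions made for illustration.

```python
def rank_lower_bound(i, j, k):
    """Lower bound on the rank of <i, j, k> as read from (2.10)."""
    if k == 1:
        return i * j                      # the rank of <I, J, 1> equals I*J
    return (i - 1) * (j + 1) + j * k      # bound assumed to hold for K > 1
```

With this reading, `rank_lower_bound(2, 2, 2)` gives 7, which Strassen's algorithm attains, and `rank_lower_bound(10, 10, 10)` gives 199, the figure quoted in the Introduction for $10 \times 10$ MM.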

Theorem 2.2. For arbitrary $I$, $J$, $K$, $(\rho(\langle I, J, K \rangle))^h > \rho(\langle I^h, J^h, K^h \rangle)$ for all sufficiently large $h$.

Notice that Theorem 2.2 does not lead to any improvement of the lower bounds (2.10). The
meaning of Theorem 2.2 is that any given exponent of MM associated with constant $c = 1$ can be
further reduced. It is not clear if there exists the minimum exponent of MM. ($\beta = 2$ could be a
candidate.) However certainly the asymptotic arithmetic complexity of MM can be represented by
$\beta^*_{\min} = \beta^*_{\min}(F)$, the smallest limiting exponent of MM which is, of course, unique if the field of
constants $F$ is given. Moreover, it is easy to prove a stronger statement on the uniqueness.

Theorem 2.3. The smallest limiting exponent of MM over $F$ does not depend on the choice of
an infinite field of constants $F$ so that for any infinite field $F$

$$\beta^*_{\min}(F) = \beta^*_{\min}(Q) = \beta^*_{\min}(C)$$

where $Q$, $C$ are the fields of rational and complex numbers respectively.

Proof. It is known that any infinite field is isomorphic to an infinite subfield of $C$ (and such a
subfield always contains $Q$). Thus we can assume that all constants from $F$ are complex numbers.
Then for arbitrary $\varepsilon > 0$ there exist integers $I = I(\varepsilon)$, $J = J(\varepsilon)$, $K = K(\varepsilon)$ such that

$$3 \log \rho_F(\langle I, J, K \rangle) / \log(IJK) < \beta^*_{\min}(F) + \varepsilon. \tag{2.11}$$

As is easy to verify (see [4]), the existence of a bilinear algorithm for $\langle I, J, K \rangle$ of a fixed rank
$M$, in particular of the rank $M = \rho_F(\langle I, J, K \rangle)$, is equivalent to the existence of a solution of a
system of algebraic equations with coefficients 0 and 1. It follows that

$$\rho_F(\langle I, J, K \rangle) = \rho_E(\langle I, J, K \rangle) \tag{2.12}$$

where $E = E(Q)$ is an algebraic extension of $Q$. (2.11) and (2.12) imply that $\beta^*_{\min}(F) + \varepsilon$ is an
exponent of MM over $E$ so that

$$\beta^*_{\min}(E) \le \beta^*_{\min}(F) + \varepsilon. \tag{2.13}$$

Theorem 2.3 follows from (2.13) for $\varepsilon \to 0$ if we recall that

$$\beta^*_{\min}(E) = \beta^*_{\min}(Q);$$

see, for instance, [9, Theorem 3.2].


Throughout  the  paper  our  results  do  not  depend  on  the  choice  of  F  unless  it  is  stated  otherwise. 


3.  The  Direct  Sum  of  Problems  and  the  Direct  Sum  Problem.  Tensor  Product 
Construction  for  Direct  Sums. 

In  this  section  we  generalize  Theorem  2.1  and  apply  it  to  the  case  where  several  independent 
matrix products are to be evaluated. We will define this problem as a particular case of direct sum
of  r  bilinear  problems. 

Definition 3.1. Given a field $F$ of constants and $r$ sets of bilinear forms $B^{(1)}, \ldots, B^{(r)}$ such that

$$B^{(s)} = \{B^{(s)}_\eta(\underline{X}^{(s)}, \underline{Y}^{(s)})\}, \quad s = 1, \ldots, r, \tag{3.1}$$

$$\underline{X} = (\underline{X}^{(1)}, \ldots, \underline{X}^{(r)}), \quad \underline{Y} = (\underline{Y}^{(1)}, \ldots, \underline{Y}^{(r)}), \tag{3.2}$$

and the entries of the vectors $\underline{X}$, $\underline{Y}$ are indeterminates. (The latter condition implies that the sets
$B^{(1)}, \ldots, B^{(r)}$ are disjoint, that is, the sets of their input variables are independent of each other.)
The problem of simultaneous evaluation of the set $\{B^{(1)}, \ldots, B^{(r)}\}$ over $F$ is called the direct sum
of the $r$ bilinear problems $B^{(1)}, \ldots, B^{(r)}$ and is designated by

$$B = \sum_{s=1}^{r} \oplus\, B^{(s)}.$$

In particular, if $r$ products of $I(s) \times J(s)$ by $J(s) \times K(s)$ matrices $X^{(s)}$ and
$Y^{(s)}$ respectively are to be evaluated over $F$ for $s = 1, \ldots, r$ and the entries of all matrices
$X^{(s)}$, $Y^{(s)}$ are indeterminates then such a direct sum of $r$ problems of MM is designated by
$\sum_{s=1}^{r} \oplus\, \langle I(s), J(s), K(s) \rangle$.

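The direct sum of $r$ problems, $B$ (see (3.1)-(3.3)), can be equivalently represented by the set
of bilinear forms, by the tensor of their coefficients, and by the following trilinear
form,

$$T(\underline{X}, \underline{Y}, \underline{Z}) = \sum_{s} T^{(s)}(\underline{X}^{(s)}, \underline{Y}^{(s)}, \underline{Z}^{(s)}), \tag{3.4}$$

where

$$T^{(s)}(\underline{X}^{(s)}, \underline{Y}^{(s)}, \underline{Z}^{(s)}) = \sum_{\eta} B^{(s)}_\eta(\underline{X}^{(s)}, \underline{Y}^{(s)})\, z^{(s)}_\eta, \tag{3.5}$$

$$\underline{Z}^{(s)} = (z^{(s)}_\eta), \quad \underline{Z} = (\underline{Z}^{(1)}, \ldots, \underline{Z}^{(r)}), \tag{3.6}$$

$\underline{Z}^{(s)}$ and $\underline{Z}$ are vectors of indeterminates, and $T^{(s)}(\underline{X}^{(s)}, \underline{Y}^{(s)}, \underline{Z}^{(s)})$ are trilinear forms that define the
bilinear problems $B^{(s)}$; see (3.1)-(3.3).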

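In the case of $\sum_{s=1}^{r} \oplus\, \langle I(s), J(s), K(s) \rangle$,

$$T(\underline{X}, \underline{Y}, \underline{Z}) = \sum_{s=1}^{r} \mathrm{Tr}(X^{(s)} Y^{(s)} Z^{(s)}) \tag{3.7}$$

where $Z^{(s)}$ is the $K(s) \times I(s)$ matrix whose entries are indeterminates, $s = 1, \ldots, r$.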

As is obvious, the solution of an arbitrary direct sum of $r$ bilinear problems can be obtained if
each of the $r$ problems is solved independently of the other $r - 1$ ones. Such a solution is represented
by the following $r$ decompositions,

$$T^{(s)}(\underline{X}^{(s)}, \underline{Y}^{(s)}, \underline{Z}^{(s)}) = \sum_{q=1}^{M(s)} L_{qs}(\underline{X}^{(s)})\, L'_{qs}(\underline{Y}^{(s)})\, L''_{qs}(\underline{Z}^{(s)}) \quad \text{for } s = 1, \ldots, r. \tag{3.8}$$

An algorithm defined by (3.8) is called a direct sum algorithm and has rank $M = \sum_{s=1}^{r} M(s)$.
However we might hope to take advantage by solving the $r$ problems simultaneously. Such a
solution is defined by the more general decomposition (2.6) and consequently gives a (bilinear)
algorithm of a more general class.

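In the case of direct sums of several bilinear problems, the $L_q(\underline{X})$, $L'_q(\underline{Y})$, $L''_q(\underline{Z})$ in (2.6) can
be defined by the following identities (rather than by (2.8), (2.9)).

$$\forall q:\quad L_q(\underline{X}) = \sum_{s,\mu} f^{(s)}_{q\mu}\, x^{(s)}_\mu, \quad L'_q(\underline{Y}) = \sum_{s,\nu} f'^{(s)}_{q\nu}\, y^{(s)}_\nu, \quad L''_q(\underline{Z}) = \sum_{s,\eta} f''^{(s)}_{q\eta}\, z^{(s)}_\eta, \tag{3.9}$$

$$f^{(s)}_{q\mu},\ f'^{(s)}_{q\nu},\ f''^{(s)}_{q\eta} \in F \quad \text{for all } q, s, \mu, \nu, \eta. \tag{3.10}$$

(On the other hand, (3.9), (3.10) can be represented as a particular case of (2.8), (2.9).)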

Again in the case of MM, $\mu$, $\nu$, $\eta$ are defined by the pairs $(i, j)$, $(j, k)$, and $(k, i)$ respectively.
Notice that the $\mu$, $\nu$, $\eta$ (and in the case of MM also the $i$, $j$, $k$) range in the domains that depend
on $s$.
Now the problem arises whether there exist algorithms (2.6), (3.9), (3.10) that are indeed faster than
the best direct sum algorithms (3.8). In particular, do there exist $r$ disjoint bilinear problems
$B^{(1)}, \ldots, B^{(r)}$ such that

$$\rho\Big(\sum_{s=1}^{r} \oplus\, B^{(s)}\Big) < \sum_{s=1}^{r} \rho(B^{(s)})\,? \tag{3.11}$$

The latter problem is called the Direct Sum Problem (DSP). The Direct Sum Conjecture (DSC) is
that (3.11) never holds. We are interested in the DSP and DSC for the class of MM algorithms.

Let  us  assume  for  a  while  that  the  DSC  for  MM  is  true.  Then  Theorem  2.1  can  be  generalized 
in  the  following  straightforward  manner. 

Proposition 3.1. Given a bilinear algorithm (2.6), (3.9), (3.10) of rank $M$ for the direct sum
of $r$ disjoint problems of MM, $\langle I(s), J(s), K(s) \rangle$, $s = 1, \ldots, r$, where $M$, $r$, $I(s)$, $J(s)$, $K(s)$ for
$s = 1, \ldots, r$ are arbitrary, $M > r$. Let $\tau = \tau^*$ be the real solution to the following equation,

$$\sum_{s=1}^{r} (I(s)J(s)K(s))^{\tau} = M. \tag{3.12}$$

Then the DSC implies that $\beta^* = 3\tau^*$ is an exponent of MM.

Definition 3.2. The equation (3.12) is called the equation associated with a bilinear algorithm of
rank $M$ for $\sum_{s=1}^{r} \oplus\, \langle I(s), J(s), K(s) \rangle$.

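Proof. Let real $\tau(s)$ satisfy the following equations

$$\rho(\langle I(s), J(s), K(s) \rangle) = (I(s)J(s)K(s))^{\tau(s)} \tag{3.13}$$

where $s = 1, \ldots, r$. Using the DSC we obtain

$$\sum_{s=1}^{r} \rho(\langle I(s), J(s), K(s) \rangle) = \rho\Big(\sum_{s=1}^{r} \oplus\, \langle I(s), J(s), K(s) \rangle\Big) \le M. \tag{3.14}$$

Combining (3.13) and (3.14) gives

$$M \ge \sum_{s=1}^{r} (I(s)J(s)K(s))^{\tau(s)} \ge \sum_{s=1}^{r} (I(s)J(s)K(s))^{\tau_{\min}} \tag{3.15}$$

where $\tau_{\min} = \min_s \tau(s)$. By virtue of Theorem 2.1, $3\tau(s)$ for all $s$ and hence $3\tau_{\min}$ are exponents
of MM. Comparing (3.12) and (3.15) gives $\tau_{\min} \le \tau^*$. ∎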

Proposition  3.1  motivates  Definition  3.2,  but  we  could  apply  that  Proposition  only  if  the  DSC 
is  proven  to  be  true  for  MM.  This  is  still  an  open  problem  (see  the  Introduction). 

Fortunately  a  generalization  of  the  Tensor  Product  Construction  (TPC)  enables  us  to  save  the 
most  essential  part  of  the  result  of  Proposition  3.1. 

Theorem 3.1 (Exponential Direct Sum Theorem, EDST). Under the conditions of Proposition 3.1,
$\beta^* = 3\tau^*$ is a limiting exponent of MM (even if the DSC is false).
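
The value $\tau^*$ is easy to compute numerically. The Python sketch below assumes that the associated equation has the form $\sum_s (I(s)J(s)K(s))^{\tau} = M$ and solves it by bisection; the function names and the tolerance are illustrative choices. As examples it uses the figures from the Introduction: a pair of $10 \times 10$ MM problems in 1300 multiplications, and the derived $10 \times 10$ by $10 \times 20$ problem in 1300 multiplications.

```python
import math


def associated_equation_root(shapes, rank, tol=1e-13):
    """Solve sum((i*j*k)**tau for (i, j, k) in shapes) == rank for tau >= 0."""
    if rank < len(shapes) or all(i * j * k == 1 for i, j, k in shapes):
        raise ValueError("need rank >= number of problems and a nontrivial shape")

    def lhs(tau):
        return sum((i * j * k) ** tau for i, j, k in shapes)

    lo, hi = 0.0, 1.0
    while lhs(hi) < rank:            # the left side is increasing in tau
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) < rank:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def direct_sum_exponent(shapes, rank):
    """Three times the root of the associated equation."""
    return 3.0 * associated_equation_root(shapes, rank)


if __name__ == "__main__":
    print(direct_sum_exponent([(10, 10, 10)] * 2, 1300))  # pair of 10x10 MM
    print(direct_sum_exponent([(10, 10, 20)], 1300))      # single rectangular MM
```

For a single problem the equation reduces to the formula of Theorem 2.1. For the pair of $10 \times 10$ problems the root equals $\log 650 / \log 1000$, the same value as for a hypothetical $10 \times 10$ algorithm of rank 650, and it is smaller than the root obtained from the single $10 \times 10$ by $10 \times 20$ problem.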


To  prove  Theorem  3.1  we  first  generalize  the  TPC. 

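Hereafter we designate

$$r \odot \langle I, J, K \rangle = \sum_{s=1}^{r} \oplus\, \langle I, J, K \rangle, \qquad r r' \odot \langle I, J, K \rangle = r \odot r' \odot \langle I, J, K \rangle \tag{3.16}$$

for arbitrary natural $r$, $r'$, $I$, $J$, $K$.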

Using this notation we represent a bilinear algorithm (2.6), (3.4)-(3.7) as the following mapping,

$$\sum_{s=1}^{r} \oplus\, \langle I(s), J(s), K(s) \rangle \to M \odot \langle 1, 1, 1 \rangle. \tag{3.17}$$

The right side of (3.17) represents the right side of (2.6) where each product $L_q(\underline{X}) L'_q(\underline{Y}) L''_q(\underline{Z})$
is represented as $\langle 1, 1, 1 \rangle$.

We recall the basic observation of the proof of Theorem 2.1 (which has led us to the TPC) that
the substitution of $I \times J$, $J \times K$, and $K \times I$ matrices for the entries of $X$, $Y$, $Z$ respectively
preserves (2.6). Such a substitution turns the direct sum of the left side of (3.17) into the direct
sum $\sum_{s=1}^{r} \oplus\, \langle I(s)I, J(s)J, K(s)K \rangle$. Also it turns each of the products $L_q(\underline{X}) L'_q(\underline{Y}) L''_q(\underline{Z})$ into the
product of $I \times J$ by $J \times K$ by $K \times I$ matrices. Hence the substitution turns (3.17) into an
algorithm that can be represented by the following mapping,

$$\sum_{s=1}^{r} \oplus\, \langle I(s)I, J(s)J, K(s)K \rangle \to M \odot \langle I, J, K \rangle. \tag{3.18}$$

We  will  state  the  latter  result  formally  as  Lemma  3.2  using  the  following  definition. 

Definition 3.3. A mapping $B \to B'$ is valid if there exists a bilinear algorithm that is represented
by such a mapping. Then we write $B \to B'$. (In this paper we use the notation $B \to B'$ mostly
in the cases where $B' = M \odot \langle 1, 1, 1 \rangle$.)

Lemma 3.2. If (3.17) is valid then (3.18) is valid.

ICquation  (3.18)  can  he  interpreted  as  the  product  of  (3.17)  by  the  trivial  mapping 

for  aribtrary  natural  A,  J,  K . 

Similarly  we  can  define  the  valid  trivial  mapping 

r' 

X  ©{//.  A,  K't)  -  X  ©(''<  A,  K't) 

*=  i  e=\ 

for  arbitrary  natural  r\  Jftf  K A  = 

Multiplying  (3.17)  and  (3.20)  we  obtain  the  following  mapping, 

r  rf  rf 

X  E eV'i'oAA'K'K't) -w©x (DC t, a, i<t)- 

<=i  <= i 


(3.19) 


(3.20) 


(3.21) 


II 


The meaning of the direct sum in the left side is obvious. The $M$ terms of the direct sum in the right side of (3.21) represent the $M$ sets each consisting of $r'$ products $L_q(X^{(t)}) L'_q(Y^{(t)}) L''_q(Z^{(t)})$, $t = 1, \ldots, r'$, $q = 1, \ldots, M$, where $X^{(t)}$, $Y^{(t)}$, $Z^{(t)}$ are $I'_t \times J'_t$, $J'_t \times K'_t$, and $K'_t \times I'_t$ matrices respectively.

To justify the validity of (3.21) (assuming the validity of (3.17)), apply Lemma 3.2 for $I = I'_t$, $J = J'_t$, $K = K'_t$, $t = 1, \ldots, r'$. Then apply the following simple lemma.

Lemma 3.3. $B \leftarrow B'$ and $\tilde{B} \leftarrow \tilde{B}'$ imply

$$B \oplus \tilde{B} \leftarrow B' \oplus \tilde{B}'.$$

We  have  proven  the  following  generalization  of  Lemma  3.2  and  of  the  basic  observation  for 
the  tensor  product  construction. 

Lemma 3.4. If mapping (3.17) is valid then mapping (3.21) is valid.

We also need the two following simple lemmas.

Lemma 3.5. $B \leftarrow B'$ and $B' \leftarrow B''$ imply $B \leftarrow B''$.


Lemma 3.6. The mapping

$$\ell'(q) \odot (I(q), J(q), K(q)) \leftarrow \bigoplus_{s=1}^{r} \ell(s) \odot (I(s), J(s), K(s))$$

is valid for arbitrary natural $q$, $\ell'(q)$, $r$, $\ell(s)$, $I(s)$, $J(s)$, $K(s)$, $s = 1, \ldots, r$, if $1 \le q \le r$, $\ell'(q) \le \ell(q)$.

Now  we  are  ready  to  prove  the  main  lemma  of  this  section. 

Lemma 3.7. Given arbitrary natural numbers $\ell$, $I$, $J$, $K$, $r$. Let

$$r \odot (I, J, K) \leftarrow \ell r \odot (1, 1, 1). \tag{3.22}$$

Then the mappings

$$r \odot (I^h, J^h, K^h) \leftarrow \ell^h r \odot (1, 1, 1) \tag{3.23}$$

are valid for $h = 1, 2, 3, \ldots$.

Proof (by induction on $h$). Let (3.23) be valid for $h = h^*$. Then by virtue of Lemma 3.2,

$$r \odot (I^{h^*+1}, J^{h^*+1}, K^{h^*+1}) \leftarrow \ell^{h^*} \odot (r \odot (I, J, K)). \tag{3.24}$$

(See the notation of (3.16).) Applying Lemma 3.5 to (3.22) and (3.24) we obtain that (3.23) is valid for $h = h^* + 1$. Observe that (3.23) for $h = 1$ is the given valid mapping (3.22). ∎


Next we restate Theorem 3.1 in the following obviously equivalent form and then prove it.

Theorem 3.1. Let for some natural numbers $M$, $r$, $I(s)$, $J(s)$, $K(s)$, $s = 1, \ldots, r$, $r < M$, the mapping (3.17) be valid and $\tau = \tau^*$ be the real solution of the associated equation (3.12). Then $\beta^* = 3\tau^*$ is a limiting exponent of MM.

Proof. Observe that Theorem 2.1 and Lemmas 3.5-3.7 imply Theorem 3.1 in the case where the valid basic mapping (3.17) takes the form (3.22). (Indeed, consider valid mapping (3.23) where $h$ is sufficiently large, apply Lemmas 3.5, 3.6 in order to delete $r$ in the left side, and then apply Theorem 2.1.)

Finally consider the general case of arbitrary valid basic mapping (3.17). Recursively applying Lemmas 3.4, 3.5 to (3.17) we obtain the following sequence of valid mappings for $h = 1, 2, 3, \ldots$,

$$\bigoplus_{\alpha \in Q(h,r)} c(\alpha) \odot (I(\alpha), J(\alpha), K(\alpha)) \leftarrow M^h \odot (1, 1, 1). \tag{3.25}$$

Here $Q(h,r)$ is the set of $r$-dimensional vectors $\alpha = (\alpha_1, \ldots, \alpha_r)$ with nonnegative integer entries $\alpha_1, \ldots, \alpha_r$ such that

$$\alpha_1 + \cdots + \alpha_r = h, \tag{3.26}$$

$$c(\alpha) = \frac{h!}{\alpha_1! \, \alpha_2! \cdots \alpha_r!}, \tag{3.27}$$

$$I(\alpha) = \prod_{s=1}^{r} (I(s))^{\alpha_s}, \qquad J(\alpha) = \prod_{s=1}^{r} (J(s))^{\alpha_s}, \qquad K(\alpha) = \prod_{s=1}^{r} (K(s))^{\alpha_s}. \tag{3.28}$$

Mapping (3.25)-(3.28) can be considered the $h$-th power of (3.17). We used the well-known formula of multinomial expansion to represent the terms in the left side of (3.25). The mapping (3.17) coincides with the mapping (3.25)-(3.28) for $h = 1$.

Simultaneously with the sequence of mappings (3.25)-(3.28) we define the following sequence of the associated equations in $\tau$,

$$\sum_{\alpha \in Q(h,r)} c(\alpha) \, (I(\alpha) J(\alpha) K(\alpha))^{\tau} = M^h, \qquad h = 1, 2, 3, \ldots. \tag{3.29}$$

We observe that for all $\tau$ and for all $h$

$$\sum_{\alpha \in Q(h,r)} c(\alpha) \, (I(\alpha) J(\alpha) K(\alpha))^{\tau} = \Big( \sum_{s=1}^{r} (I(s) J(s) K(s))^{\tau} \Big)^{h}.$$

It follows that the equations (3.29) have the same (real) solution for all $h$, which coincides with the solution $\tau = \tau^*$ of the equation (3.12).
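The identity above can be checked numerically. The following is a minimal sketch, not part of the paper: the list `sizes` of triples $(I(s), J(s), K(s))$ and the value of $M$ used in the usage note are hypothetical, and the helper names are ours. It solves the equation (3.29) for $h = 1$ by bisection and evaluates the left side of (3.29) for larger $h$ by enumerating $Q(h,r)$.

```python
from math import factorial, prod


def compositions(h, r):
    """Yield all vectors (a_1, ..., a_r) of nonnegative integers with sum h, i.e. Q(h, r)."""
    if r == 1:
        yield (h,)
        return
    for first in range(h + 1):
        for rest in compositions(h - first, r - 1):
            yield (first,) + rest


def multinomial(alpha):
    """c(alpha) = h! / (alpha_1! ... alpha_r!) as in (3.27)."""
    c = factorial(sum(alpha))
    for a in alpha:
        c //= factorial(a)
    return c


def power_sum(sizes, tau, h):
    """Left side of (3.29): sum over Q(h, r) of c(alpha) (I(alpha) J(alpha) K(alpha))^tau."""
    total = 0.0
    for alpha in compositions(h, len(sizes)):
        volume = prod((i * j * k) ** a for (i, j, k), a in zip(sizes, alpha))
        total += multinomial(alpha) * volume ** tau
    return total


def solve_tau(sizes, m, lo=0.0, hi=3.0, steps=200):
    """Bisection for the root of sum_s (I(s) J(s) K(s))^tau = M on [lo, hi]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if sum((i * j * k) ** mid for i, j, k in sizes) < m:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For instance, with the hypothetical data `sizes = [(2, 3, 4), (3, 3, 2)]` and `M = 40`, the value `tau = solve_tau(sizes, 40)` makes `power_sum(sizes, tau, h)` agree with `40 ** h` up to rounding for every small `h`, as the identity predicts.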

Let $\alpha^* = \alpha^*(h)$ be a vector from $Q(h,r)$ such that

$$c(\alpha^*) \, (I(\alpha^*) J(\alpha^*) K(\alpha^*))^{\tau^*} = \max_{\alpha \in Q(h,r)} c(\alpha) \, (I(\alpha) J(\alpha) K(\alpha))^{\tau^*} \ge M^h / |Q(h,r)| \tag{3.30}$$

where $|Q(h,r)|$ is the cardinality of the set $Q(h,r)$; it does not exceed $(r+h)!/h!$.

As follows from the validity of mapping (3.25)-(3.28) and from Lemmas 3.5 and 3.6,

$$c \odot (I(\alpha^*), J(\alpha^*), K(\alpha^*)) \leftarrow M^h \odot (1, 1, 1)$$

for all $c \le c(\alpha^*)$. We choose $c = M^g$ where $g$ is the natural number such that $M^g \le c(\alpha^*) < M^{g+1}$. Then we come to a valid mapping which can be represented in the form (3.22). Hence the real solution $\tau = \tau(h)$ to the associated equation

$$M^g \, (I(\alpha^*) J(\alpha^*) K(\alpha^*))^{\tau} = M^h \tag{3.31}$$

is a limiting exponent of MM.

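On the other hand, since the cardinality of $Q(h,r)$ does not exceed $(r+h)!/h!$, (3.29), (3.30) imply the next relations,

$$c(\alpha^*) \, (I(\alpha^*) J(\alpha^*) K(\alpha^*))^{\tau^*} \ge \frac{1}{|Q(h,r)|} \sum_{\alpha \in Q(h,r)} c(\alpha) \, (I(\alpha) J(\alpha) K(\alpha))^{\tau^*} = \frac{M^h}{|Q(h,r)|} \ge M^h \frac{h!}{(r+h)!}.$$

Since $M^g > c(\alpha^*)/M$ and since $(I(\alpha^*) J(\alpha^*) K(\alpha^*))^{\varepsilon} \ge M (r+h)!/h!$ for all $\varepsilon > 0$ and for all sufficiently large $h$ (see (3.27), (3.30) and recall that $M > r$), it follows that for arbitrary $\varepsilon > 0$

$$M^g \, (I(\alpha^*) J(\alpha^*) K(\alpha^*))^{\tau^* + \varepsilon} > M^h \tag{3.32}$$

if $h = h(\varepsilon)$ is chosen sufficiently large.

Comparing (3.31) for $\tau = \tau(h)$ and (3.32) we obtain for arbitrary $\varepsilon > 0$

$$\tau^* + \varepsilon > \tau(h(\varepsilon)).$$

Hence $\tau^* + \varepsilon$ is an exponent of MM for any $\varepsilon > 0$. ∎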


4.  Trilinear  Aggregating  Generated  by  Tables. 

In  this  section  we  introduce  the  techniques  of  trilinear  aggregating,  TA,  in  new  modified 
versions  and  describe  the  method  in  a  more  formal  and  more  general  way  than  we  did  earlier.  We 
start  with  an  illustrative  example  of  TA. 

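Example 4.1. (2-Procedure.)

$$\mathrm{Tr}(XYZ) + \mathrm{Tr}(UVW) = \sum_{i,j,k} (x_{ij} + u_{jk})(y_{jk} + v_{ki})(z_{ki} + w_{ij}) - \sum_{i,j} x_{ij} \Big( \sum_{k} (y_{jk} + v_{ki}) \Big) w_{ij}$$

$$\qquad - \sum_{j,k} u_{jk} y_{jk} \sum_{i} (z_{ki} + w_{ij}) - \sum_{k,i} \Big( \sum_{j} (x_{ij} + u_{jk}) \Big) v_{ki} z_{ki}.$$

To simplify the formula we have slightly deviated from our previous notation, writing $X$, $Y$, $Z$, $U$, $V$, $W$ rather than $X^{(1)}$, $Y^{(1)}$, $Z^{(1)}$, $X^{(2)}$, $Y^{(2)}$, $Z^{(2)}$ respectively. Let $X$, $Y$, $Z$, $U$, $V$, $W$ be $I \times J$, $J \times K$, $K \times I$, $J \times K$, $K \times I$, $I \times J$ matrices respectively and let $i$, $j$, $k$ in the above identities range from 0 to $I - 1$, $J - 1$ and $K - 1$ respectively. Then the 2-Procedure implies that for arbitrary natural $I$, $J$, $K$:

$$(I, J, K) \oplus (J, K, I) \leftarrow (IJK + IJ + JK + KI) \odot (1, 1, 1).$$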
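The identity of Example 4.1 can be checked on random integer matrices. The sketch below is only an illustration with function names of our choosing: it evaluates the right side of the identity product by product, counts the trilinear products used, and leaves the comparison with $\mathrm{Tr}(XYZ) + \mathrm{Tr}(UVW)$ to the caller.

```python
def trace3(a, b, c):
    """Tr(ABC) for matrices given as nested lists; a is p x q, b is q x s, c is s x p."""
    return sum(a[i][j] * b[j][k] * c[k][i]
               for i in range(len(a))
               for j in range(len(b))
               for k in range(len(c)))


def two_procedure(x, y, z, u, v, w):
    """Right side of the 2-Procedure identity and the number of trilinear products used.

    x, y, z are I x J, J x K, K x I; u, v, w are J x K, K x I, I x J.
    """
    I, J, K = len(x), len(y), len(z)
    total, products = 0, 0
    # IJK aggregates
    for i in range(I):
        for j in range(J):
            for k in range(K):
                total += (x[i][j] + u[j][k]) * (y[j][k] + v[k][i]) * (z[k][i] + w[i][j])
                products += 1
    # IJ united correction terms
    for i in range(I):
        for j in range(J):
            total -= x[i][j] * sum(y[j][k] + v[k][i] for k in range(K)) * w[i][j]
            products += 1
    # JK united correction terms
    for j in range(J):
        for k in range(K):
            total -= u[j][k] * y[j][k] * sum(z[k][i] + w[i][j] for i in range(I))
            products += 1
    # KI united correction terms
    for k in range(K):
        for i in range(I):
            total -= sum(x[i][j] + u[j][k] for j in range(J)) * v[k][i] * z[k][i]
            products += 1
    return total, products
```

For $I = 2$, $J = 3$, $K = 4$ the count returned is $24 + 6 + 12 + 8 = 50$, and the value returned equals `trace3(x, y, z) + trace3(u, v, w)` for any integer inputs of the stated shapes.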

The 2-Procedure of TA can be deduced from the following table.

Table 4.1.

    $x_{ij}$    $y_{jk}$    $z_{ki}$
    $u_{jk}$    $v_{ki}$    $w_{ij}$


We will explain how to define TA by the following more general tables.

Table 4.2.

    $x^{(1)}_{i(1)j(1)}$    $y^{(1)}_{j(1)k(1)}$    $z^{(1)}_{k(1)i(1)}$
    $x^{(2)}_{i(2)j(2)}$    $y^{(2)}_{j(2)k(2)}$    $z^{(2)}_{k(2)i(2)}$
    ...
    $x^{(r)}_{i(r)j(r)}$    $y^{(r)}_{j(r)k(r)}$    $z^{(r)}_{k(r)i(r)}$

Definition 4.1. Given an $r \times 3$ table (Table 4.2) whose entries $(s,1)$, $(s,2)$, $(s,3)$ are filled with the indeterminates $x^{(s)}_{i(s)j(s)}$, $y^{(s)}_{j(s)k(s)}$, $z^{(s)}_{k(s)i(s)}$ respectively. Then the table is called Generating Table, GT. The product

$$\pi(q, s, t) = x^{(q)}_{i(q)j(q)} \, y^{(s)}_{j(s)k(s)} \, z^{(t)}_{k(t)i(t)} \tag{4.1}$$

is called either the $s$-th principal term of the GT if $q = s = t$ or the correction term $(q, s, t)$ of the GT otherwise. The product $\sum_{q=1}^{r} x^{(q)}_{i(q)j(q)} \sum_{s=1}^{r} y^{(s)}_{j(s)k(s)} \sum_{t=1}^{r} z^{(t)}_{k(t)i(t)}$ is called the aggregate of the table.

Table 4.1 is an example of GT where $r = 2$, $i(1) = i$, $j(1) = j$, $k(1) = k$, $i(2) = j$, $j(2) = k$, $k(2) = i$.

The  next  result  is  easy  to  verify. 

Lemma 4.1. Given Generating Table 4.2, its aggregate is identically the sum of all its principal and correction terms.

Hereafter we assume that the $3r$ subscripts $i(s)$, $j(s)$, $k(s)$, $s = 1, \ldots, r$, in the GT are integer variables that independently of each other range from 0 to some fixed bounds $I - 1$, $J - 1$, $K - 1$. We designate that

$$H = IJK. \tag{4.2}$$

Remark 4.1. We will not use the obvious possibility to generalize our construction to the case where $I = I(s)$, $J = J(s)$, $K = K(s)$ depend on $s$ but $H = I(s)J(s)K(s)$ does not depend on $s$.

Then there exist $H$ instances of such a GT and therefore $H$ instances of each principal term, of each correction term, and of the aggregate of that GT. The next simple fact is important for us.


Lemma 4.2. The sum of the $H$ instances of the $s$-th principal terms of Generating Table 4.2 is identically $\mathrm{Tr}(X^{(s)} Y^{(s)} Z^{(s)})$ where $X^{(s)} = (x^{(s)}_{i(s)j(s)})$, $Y^{(s)} = (y^{(s)}_{j(s)k(s)})$, $Z^{(s)} = (z^{(s)}_{k(s)i(s)})$ are $I \times J$, $J \times K$, $K \times I$ matrices respectively.

Corollary 4.1. Given $H$ instances of Generating Table 4.2 (where (4.2) holds). Then

$$r \odot (I, J, K) \leftarrow (H + \rho_c) \odot (1, 1, 1) \tag{4.3}$$

where $\rho_c$ is the rank of the sum of the $H$ instances of all correction terms of the GT. (See Definition 2.3 about the ranks of trilinear forms.)

Indeed, the sum of the $H$ instances of the aggregates gives $H \odot (1, 1, 1)$. Subtracting the sum of all instances of all correction terms gives (4.3) by virtue of Lemmas 4.1 and 4.2. ∎

In  the  sequel  we  combine  Corollary  4.1  with  the  techniques  of  Implicit  Canceling  of  correction 
terms  of  Table  4.2,  see  Section  7. 


5.  Generating  Tables  with  Vectors  as  Subscripts. 

In this section we combine the TPC and TA. Let $m$, $n$ be natural numbers. Consider the $m$-dimensional vector $\underline{h} = (h(1), \ldots, h(m))$ where $h(g)$ are independent integer parameters that range from 0 to $n - 1$. Consider also $r$ different partitions of the vector $\underline{h}$ into $\underline{i}(s)$, $\underline{j}(s)$, $\underline{k}(s)$, $s = 1, \ldots, r$, its three disjoint subvectors of dimensions $\ell$, $\ell'$, $\ell''$ respectively where $\ell$, $\ell'$, $\ell''$, $r$ are fixed natural numbers such that

$$\ell + \ell' + \ell'' = m, \qquad r \le m!/(\ell! \, \ell'! \, \ell''!). \tag{5.1}$$

Remark 5.1. Here and hereafter we assume that the order of the entries of a vector is preserved for its subvectors.

We will use the following notation to represent the $s$-th partition of the vector $\underline{h}$,

$$\underline{h} = \underline{i}(s) \, \underline{j}(s) \, \underline{k}(s) \quad \text{for } s = 1, \ldots, r, \tag{5.2}$$

$$\underline{i}(s) = (i(1,s), \ldots, i(\ell,s)), \qquad i(t,s) = h(q(t,s)), \qquad t = 1, \ldots, \ell, \tag{5.3}$$

$$\underline{j}(s) = (j(1,s), \ldots, j(\ell',s)), \qquad j(t',s) = h(q'(t',s)), \qquad t' = 1, \ldots, \ell', \tag{5.4}$$

$$\underline{k}(s) = (k(1,s), \ldots, k(\ell'',s)), \qquad k(t'',s) = h(q''(t'',s)), \qquad t'' = 1, \ldots, \ell''. \tag{5.5}$$

Since for all $s$ the entries of $\underline{i}(s)$, $\underline{j}(s)$, $\underline{k}(s)$ coincide with some entries of $\underline{h}$, they are also integer parameters that range from 0 to $n - 1$.

Now we establish the following obvious one-to-one correspondence between the triplets of vectors $(\underline{i}(s), \underline{j}(s), \underline{k}(s))$ and integers $(i(s), j(s), k(s))$,

$$i(s) = \sum_{t=1}^{\ell} i(t,s) \, n^{t-1}, \qquad j(s) = \sum_{t'=1}^{\ell'} j(t',s) \, n^{t'-1}, \qquad k(s) = \sum_{t''=1}^{\ell''} k(t'',s) \, n^{t''-1}. \tag{5.6}$$

This implies that $i(s)$, $j(s)$, $k(s)$ range from 0 to $I - 1$, $J - 1$, $K - 1$ respectively where

$$I = n^{\ell}, \qquad J = n^{\ell'}, \qquad K = n^{\ell''}, \qquad IJK = n^m = H. \tag{5.7}$$

(Compare (4.2).)
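The correspondence (5.6) is simply the base-$n$ encoding of a vector of digits. A minimal sketch (the function names are ours, and the 0-based position `t` in the code plays the role of $t - 1$ in (5.6)):

```python
def vector_to_index(vec, n):
    """Map a vector of digits in 0..n-1 to the integer sum_t vec[t] * n**t, as in (5.6)."""
    return sum(digit * n ** t for t, digit in enumerate(vec))


def index_to_vector(index, length, n):
    """Inverse map: recover the vector of the given length from an integer in 0..n**length - 1."""
    vec = []
    for _ in range(length):
        index, digit = divmod(index, n)
        vec.append(digit)
    return tuple(vec)
```

For a vector of dimension $\ell$ the indices fill the range $0, \ldots, n^{\ell} - 1$ exactly once, which is the statement $I = n^{\ell}$ of (5.7).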

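Now we can rewrite Generating Table 4.2 in the following equivalent form.

Table 5.1.

    $x^{(1)}_{\underline{i}(1)\underline{j}(1)}$    $y^{(1)}_{\underline{j}(1)\underline{k}(1)}$    $z^{(1)}_{\underline{k}(1)\underline{i}(1)}$
    $x^{(2)}_{\underline{i}(2)\underline{j}(2)}$    $y^{(2)}_{\underline{j}(2)\underline{k}(2)}$    $z^{(2)}_{\underline{k}(2)\underline{i}(2)}$
    ...
    $x^{(r)}_{\underline{i}(r)\underline{j}(r)}$    $y^{(r)}_{\underline{j}(r)\underline{k}(r)}$    $z^{(r)}_{\underline{k}(r)\underline{i}(r)}$

We will consider Tables 4.2 and 5.1 identical assuming that

$$x^{(s)}_{\underline{i}(s)\underline{j}(s)} = x^{(s)}_{i(s)j(s)}, \qquad y^{(s)}_{\underline{j}(s)\underline{k}(s)} = y^{(s)}_{j(s)k(s)}, \qquad z^{(s)}_{\underline{k}(s)\underline{i}(s)} = z^{(s)}_{k(s)i(s)}. \tag{5.8}$$

(See (5.2)-(5.6).) Consequently we will designate (compare (4.1))

$$\pi(q, s, t) = x^{(q)}_{\underline{i}(q)\underline{j}(q)} \, y^{(s)}_{\underline{j}(s)\underline{k}(s)} \, z^{(t)}_{\underline{k}(t)\underline{i}(t)} \tag{5.9}$$

and also extend the definition of the principal and correction terms and of the aggregate of Table 4.1 as well as Corollary 4.1 to the case of Table 5.1. On the other hand, we will exploit the vector structure of the subscripts of the indeterminates of Table 5.1 in our next definition.

Remark 5.2. Because of the identities (5.8) we will not distinguish between the two bilinear problems associated with Tables 4.2 and 5.1. In particular, we substitute (5.7) in (4.3) and obtain

$$r \odot (n^{\ell}, n^{\ell'}, n^{\ell''}) \leftarrow (n^m + \rho_c) \odot (1, 1, 1). \tag{5.10}$$
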

Definition 5.1. $\deg_g x^{(q)}_{\underline{i}(q)\underline{j}(q)}$ (respectively $\deg_g y^{(s)}_{\underline{j}(s)\underline{k}(s)}$, $\deg_g z^{(t)}_{\underline{k}(t)\underline{i}(t)}$), the degree of $x^{(q)}_{\underline{i}(q)\underline{j}(q)}$ (respectively of $y^{(s)}_{\underline{j}(s)\underline{k}(s)}$ or $z^{(t)}_{\underline{k}(t)\underline{i}(t)}$) in $h(g)$, is the number of occurrences of the $h(g)$ among the entries of vectors $\underline{i}(q)$, $\underline{j}(q)$ (respectively $\underline{j}(s)$, $\underline{k}(s)$ or $\underline{k}(t)$, $\underline{i}(t)$) where $1 \le g \le m$, $1 \le q, s, t \le r$. If $\pi(q, s, t)$ (see (5.9)) is a principal or correction term of Table 5.1 then

$$\deg_g \pi(q, s, t) = \deg_g x^{(q)}_{\underline{i}(q)\underline{j}(q)} + \deg_g y^{(s)}_{\underline{j}(s)\underline{k}(s)} + \deg_g z^{(t)}_{\underline{k}(t)\underline{i}(t)}. \tag{5.11}$$

$\pi(q, s, t)$ is a product of degree 1 if it has degree 1 in $h(g)$ for at least one value $g$, $1 \le g \le m$. The dimension of $\pi(q, s, t)$ is the number of different $g$ such that the degree of the $\pi(q, s, t)$ in the $h(g)$ is positive.

The next simple estimates follow from the fact that all of the entries of the three vectors $\underline{i}(s)$, $\underline{j}(s)$, $\underline{k}(s)$ are different parameters.

Lemma 5.1. Each principal term of Table 5.1 has degree 2 in all $h(g)$, $g = 1, \ldots, m$. The degree of each correction term of Table 5.1 in any $h(g)$ is at most 3.

The next result follows from Definition 5.1 (see in particular (5.11)). It is important for our designs of fast MM algorithms in the next sections.


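Lemma 5.2. Let $\pi(q, s, t)$ (see (5.9)), a correction term of Table 5.1, have degree 1 in $h(g)$ for some $g$, $q$, $s$, $t$, $1 \le g \le m$, $1 \le q, s, t \le r$. Then the sum

$$p_g(q, s, t) = \sum_{h(g)=0}^{n-1} \pi(q, s, t) \tag{5.12}$$

has rank 1 and, more specifically,

$$p_g(q, s, t) = \Big( \sum_{h(g)=0}^{n-1} x^{(q)}_{\underline{i}(q)\underline{j}(q)} \Big) \, y^{(s)}_{\underline{j}(s)\underline{k}(s)} \, z^{(t)}_{\underline{k}(t)\underline{i}(t)} \quad \text{if } \deg_g x^{(q)}_{\underline{i}(q)\underline{j}(q)} = 1, \tag{5.13}$$

$$p_g(q, s, t) = x^{(q)}_{\underline{i}(q)\underline{j}(q)} \Big( \sum_{h(g)=0}^{n-1} y^{(s)}_{\underline{j}(s)\underline{k}(s)} \Big) \, z^{(t)}_{\underline{k}(t)\underline{i}(t)} \quad \text{if } \deg_g y^{(s)}_{\underline{j}(s)\underline{k}(s)} = 1, \tag{5.14}$$

$$p_g(q, s, t) = x^{(q)}_{\underline{i}(q)\underline{j}(q)} \, y^{(s)}_{\underline{j}(s)\underline{k}(s)} \Big( \sum_{h(g)=0}^{n-1} z^{(t)}_{\underline{k}(t)\underline{i}(t)} \Big) \quad \text{if } \deg_g z^{(t)}_{\underline{k}(t)\underline{i}(t)} = 1. \tag{5.15}$$

In fact, in Example 4.1 we have already exploited the advantages given by Lemma 5.2 by uniting the correction terms of Table 4.1 into the sum of only $IJ + JK + KI$ products. In Section 7 we will see some additional reasons to seek Tables 5.1 whose correction terms have degree 1.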


6.  How  to  Design  Generating  Tables  with  Correction  Terms  of  Degree  1? 

In this section we define two constructions of large Generating Tables 5.1 with correction terms of degree 1. In Section 7 we will exploit the latter property. We hope that our constructions will be eventually generalized and improved. We will use the following notation and definition.

Notation 6.1. $\Lambda$ is the empty (0-dimensional) vector. Let $\varphi$, $\theta$ be subvectors of a given vector $\underline{h}$. Then $\varphi \cup \theta$ and $\varphi \cap \theta$, the two subvectors of $\underline{h}$, are the union and the intersection of $\varphi$ and $\theta$ respectively. (Then $\varphi \cup \theta = \varphi\theta$ if $\varphi \cap \theta = \Lambda$; see Remark 5.1 and Equation (5.2).) $h(g)$ is the $g$-th entry of $\underline{h}$, $h(g) \in \underline{h}$.

Definition 6.1. The partitions of two $D$-dimensional vectors $\varphi$ and $\theta$ into $\chi$ disjoint subvectors $\varphi_1, \ldots, \varphi_\chi$ and $\theta_1, \ldots, \theta_\chi$ are isomorphic if $\varphi(g) \in \varphi_\nu$ implies $\theta(g) \in \theta_\nu$ for $g = 1, \ldots, D$, $\nu = 1, \ldots, \chi$.


Now we are ready to describe our First Construction. Let a natural $m$ be a multiple of 4,

$$m = 4u, \tag{6.1}$$

and let $\underline{h}_1$, $\underline{h}_2$ be the two $(2u)$-dimensional subvectors of the vector $\underline{h}$ that consist of the first $2u$ and the last $2u$ entries of $\underline{h}$ respectively. Then write

$$r = (2u)!/(u!)^2. \tag{6.2}$$

Let $\varphi(s)$, $\psi(s)$ for $s = 1, \ldots, r$ partition $\underline{h}_1$ into pairs of disjoint $u$-dimensional subvectors. Let $\varphi'(s)$, $\psi'(s)$ for $s = 1, \ldots, r$ be the isomorphic partitions of $\underline{h}_2$. Then we define $\underline{i}(s)$, $\underline{j}(s)$, $\underline{k}(s)$, the vectors-subscripts of Table 5.1, as follows.

$$\underline{i}(s) = \varphi(s), \qquad \underline{j}(s) = \psi(s)\varphi'(s), \qquad \underline{k}(s) = \psi'(s), \qquad s = 1, \ldots, r. \tag{6.3}$$

Now Table 5.1 is defined by the vector $\underline{h}$ and by its $r$ partitions into the triplets of disjoint subvectors $(\underline{i}(s), \underline{j}(s), \underline{k}(s))$ such that (6.1)-(6.3) hold. This is our First Construction. We call it also the $r$-Procedure of TA for $r = (2u)!/(u!)^2$.

We  will  use  the  following  result. 


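Lemma 6.1. Let Table 5.1 be defined by the $r$-Procedure of TA for $r = (2u)!/(u!)^2$ where (6.1)-(6.3) hold. Then each correction term $\pi(q, s, t)$ of Table 5.1 has degree 1,

$$\forall q \, \forall s \, \forall t \, \exists g: \ \deg_g \pi(q, s, t) = 1 \ \text{unless } q = s = t. \tag{6.4}$$

Furthermore for each correction term $\pi(q, s, t)$ of Table 5.1 (see (4.1), (5.9)), and for each $g$, $1 \le g \le m$, either

$$h(g) \in \underline{h}_1, \qquad \deg_g x^{(q)}_{\underline{i}(q)\underline{j}(q)} = 1 \tag{6.5}$$

or

$$h(g) \in \underline{h}_2, \qquad \deg_g y^{(s)}_{\underline{j}(s)\underline{k}(s)} = 1. \tag{6.6}$$

Proof. Equations (6.5), (6.6) immediately follow if one examines the next combination of (5.9) and (6.3),

$$\pi(q, s, t) = x^{(q)}_{\varphi(q), \, \psi(q)\varphi'(q)} \; y^{(s)}_{\psi(s)\varphi'(s), \, \psi'(s)} \; z^{(t)}_{\psi'(t), \, \varphi(t)}, \qquad q, s, t = 1, \ldots, r. \tag{6.7}$$

We recall (see Notation 6.1) that

$$\forall s: \ \varphi(s)\psi(s) = \underline{h}_1, \qquad \varphi'(s)\psi'(s) = \underline{h}_2$$

and that this exhausts all $r$ possible partitions of $\underline{h}_1$ into the disjoint pairs of $u$-dimensional subvectors and also all $r$ isomorphic partitions of $\underline{h}_2$. Hence

$$\forall q \, \forall s \, \forall t: \ \psi(s) \cap \varphi(t) \ne \Lambda \ \text{if } s \ne t, \qquad \varphi'(q) \cap \psi'(t) \ne \Lambda \ \text{if } q \ne t.$$

It follows that the dimension of the vector $\psi(s) \cup \varphi(t)$ (respectively $\varphi'(q) \cup \psi'(t)$) is at most $2u - 1$ and such a vector is a proper subvector of the $(2u)$-dimensional vector $\underline{h}_1 = \varphi(q)\psi(q)$ unless $s = t$ (respectively of $\underline{h}_2 = \varphi'(s)\psi'(s)$ unless $q = t$). This proves (6.4). ∎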
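For small $u$ the statement (6.4) can be confirmed by exhaustive enumeration. The sketch below is ours and only illustrative: it represents the entries of $\underline{h}$ by the positions $0, \ldots, 4u - 1$ (0-based, unlike the paper), builds the subscript sets of (6.3) as tuples of positions, and computes degrees by Definition 5.1.

```python
from itertools import combinations


def first_construction(u):
    """List of (i(s), j(s), k(s)) as tuples of positions of h, following (6.3)."""
    h1 = range(2 * u)
    table = []
    for phi in combinations(h1, u):
        psi = tuple(g for g in h1 if g not in phi)
        phi2 = tuple(g + 2 * u for g in phi)   # isomorphic partition of h2
        psi2 = tuple(g + 2 * u for g in psi)
        table.append((phi, psi + phi2, psi2))
    return table


def degree(table, q, s, t, g):
    """deg_g pi(q, s, t) by (5.11): occurrences of position g in the subscripts of x, y, z."""
    i_q, j_q, _ = table[q]
    _, j_s, k_s = table[s]
    i_t, _, k_t = table[t]
    return ((g in i_q) + (g in j_q)      # x^{(q)}_{i(q) j(q)}
            + (g in j_s) + (g in k_s)    # y^{(s)}_{j(s) k(s)}
            + (g in k_t) + (g in i_t))   # z^{(t)}_{k(t) i(t)}


def check_lemma_6_1(u):
    """True if every correction term has degree 1 in some h(g) and every principal term has degree 2 in all h(g)."""
    table = first_construction(u)
    r, m = len(table), 4 * u
    for q in range(r):
        for s in range(r):
            for t in range(r):
                degrees = [degree(table, q, s, t, g) for g in range(m)]
                if q == s == t:
                    if any(d != 2 for d in degrees):
                        return False
                elif 1 not in degrees:
                    return False
    return True
```

The check covers all $r^3$ triples, so it is practical only for small $u$ (for $u = 2$ there are $6^3 = 216$ triples).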




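Now we present our Second Construction. Let $m$ be divisible by 9,

$$m = 9v, \tag{6.8}$$

and let $\underline{h}_1$, $\underline{h}_2$, $\underline{h}_3$ be the three $(3v)$-dimensional subvectors of $\underline{h}$ that consist of the first $3v$, the next $3v$, and the last $3v$ entries of the vector $\underline{h}$ respectively. Then write that

$$r = (3v)!/(v!)^3. \tag{6.9}$$

Consider all $r$ possible partitions of the vector $\underline{h}_1$ into the triplets of disjoint $v$-dimensional subvectors $\alpha(s)$, $\beta(s)$, $\gamma(s)$, $s = 1, \ldots, r$. Let $\alpha'(s)$, $\beta'(s)$, $\gamma'(s)$ and $\alpha''(s)$, $\beta''(s)$, $\gamma''(s)$ be partitions of $\underline{h}_2$ and $\underline{h}_3$ respectively that are isomorphic to the partition $\alpha(s)$, $\beta(s)$, $\gamma(s)$ of $\underline{h}_1$, $s = 1, \ldots, r$. Then define $\underline{i}(s)$, $\underline{j}(s)$, $\underline{k}(s)$, the vectors-subscripts of Table 5.1, as follows.

$$\underline{i}(s) = \alpha(s)\alpha'(s)\beta''(s), \qquad \underline{j}(s) = \beta(s)\gamma'(s)\alpha''(s), \qquad \underline{k}(s) = \gamma(s)\beta'(s)\gamma''(s), \qquad s = 1, \ldots, r. \tag{6.10}$$

This is our Second Construction of Generating Tables 5.1 or the $r$-Procedure of TA for $r = (3v)!/(v!)^3$. Substitute (6.10) in (5.9). Then we obtain

$$\pi(q, s, t) = x^{(q)}_{\alpha(q)\alpha'(q)\beta''(q), \, \beta(q)\gamma'(q)\alpha''(q)} \; y^{(s)}_{\beta(s)\gamma'(s)\alpha''(s), \, \gamma(s)\beta'(s)\gamma''(s)} \; z^{(t)}_{\gamma(t)\beta'(t)\gamma''(t), \, \alpha(t)\alpha'(t)\beta''(t)}. \tag{6.11}$$

Here $1 \le q, s, t \le r$. Equations (6.11) will help us to follow the proof of the next result.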

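Lemma 6.2. Let Table 5.1 be defined by the $r$-Procedure of TA for $r = (3v)!/(v!)^3$; see (6.8)-(6.10). Then each correction term of dimension $m$ of that table has degree 1.

Proof. Let $\pi(q, s, t)$, a correction term of Table 5.1, have dimension $m$ and not have degree 1. Then the $v$-dimensional vectors $\gamma(q)$, $\alpha(s)$, $\beta(t)$ are to be disjoint. Indeed, if $h(g) \in \gamma(q) \cap \alpha(s)$ then $h(g) \notin \alpha(q) \cup \beta(q) \cup \beta(s) \cup \gamma(s) \cup \underline{h}_2 \cup \underline{h}_3$. Hence the degrees of $x^{(q)}_{\underline{i}(q)\underline{j}(q)}$ and of $y^{(s)}_{\underline{j}(s)\underline{k}(s)}$ in $h(g)$ are equal to 0 (see (6.11) and recall that $\alpha(\sigma)$, $\beta(\sigma)$, $\gamma(\sigma)$ are disjoint for all $\sigma$, in particular, for $\sigma = q$, $\sigma = s$). If the degree of $z^{(t)}_{\underline{k}(t)\underline{i}(t)}$ in the $h(g)$ is zero then the dimension of $\pi(q, s, t)$ is at most $m - 1$, otherwise the degree of $\pi(q, s, t)$ in the $h(g)$ is one. Hence $\gamma(q)$ and $\alpha(s)$ are disjoint. Similarly we verify that $\alpha(s) \cap \beta(t) = \beta(t) \cap \gamma(q) = \Lambda$. Hence

$$\gamma(q)\alpha(s)\beta(t) = \underline{h}_1. \tag{6.12}$$

Similarly we obtain

$$\beta'(q)\alpha'(s)\gamma'(t) = \underline{h}_2, \qquad \gamma''(q)\beta''(s)\alpha''(t) = \underline{h}_3. \tag{6.13}$$

Since the partitions $\alpha(\sigma)$, $\beta(\sigma)$, $\gamma(\sigma)$ of $\underline{h}_1$, $\alpha'(\sigma)$, $\beta'(\sigma)$, $\gamma'(\sigma)$ of $\underline{h}_2$ and $\alpha''(\sigma)$, $\beta''(\sigma)$, $\gamma''(\sigma)$ of $\underline{h}_3$ are isomorphic, (6.13) implies that

$$\beta(q)\alpha(s)\gamma(t) = \underline{h}_1, \tag{6.14}$$

$$\gamma(q)\beta(s)\alpha(t) = \underline{h}_1. \tag{6.15}$$

Combining (6.12) and (6.14) implies that

$$\gamma(q)\beta(t) = \beta(q)\gamma(t). \tag{6.16}$$

Since for all $\sigma$ the vectors $\beta(\sigma)$, $\gamma(\sigma)$ are disjoint and have dimension $v$, (6.16) implies that

$$\beta(q) = \beta(t), \qquad \gamma(q) = \gamma(t). \tag{6.17}$$

Similarly (6.12) and (6.15) imply that $\alpha(s)\beta(t) = \beta(s)\alpha(t)$ and hence

$$\alpha(s) = \alpha(t), \qquad \beta(s) = \beta(t). \tag{6.18}$$

Since $\alpha(\sigma)\beta(\sigma)\gamma(\sigma) = \underline{h}_1$ for all $\sigma$, we obtain from (6.17), (6.18)

$$\alpha(q) = \alpha(s) = \alpha(t), \qquad \beta(q) = \beta(s) = \beta(t), \qquad \gamma(q) = \gamma(s) = \gamma(t). \tag{6.19}$$

As follows from the isomorphism of our partitions of $\underline{h}_1$, $\underline{h}_2$, $\underline{h}_3$ and from (6.19), $\pi(q, s, t) = \pi(s, s, s)$ is a principal term of Table 5.1. This contradicts our assumption that $\pi(q, s, t)$ is a correction term. ∎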
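Lemma 6.2 can likewise be confirmed by enumeration for $v = 1$. This sketch is ours: it assumes the subscript assignment (6.10) as written above, uses 0-based positions $0, \ldots, 9v - 1$ for the entries of $\underline{h}$, and counts the correction terms that have no degree 1, split by whether their dimension equals $m$.

```python
from itertools import combinations


def ordered_triples(v):
    """All ordered partitions of positions 0..3v-1 into three v-element sets (alpha, beta, gamma)."""
    base = set(range(3 * v))
    for a in combinations(sorted(base), v):
        rest = base - set(a)
        for b in combinations(sorted(rest), v):
            yield frozenset(a), frozenset(b), frozenset(rest - set(b))


def second_construction(v):
    """List of (i(s), j(s), k(s)) as sets of positions of h, following (6.10)."""
    def shift(part, offset):
        return frozenset(g + offset for g in part)

    table = []
    for a, b, c in ordered_triples(v):
        a1, b1, c1 = (shift(p, 3 * v) for p in (a, b, c))   # partition of h2
        a2, b2, c2 = (shift(p, 6 * v) for p in (a, b, c))   # partition of h3
        table.append((a | a1 | b2, b | c1 | a2, c | b1 | c2))
    return table


def classify_correction_terms(v):
    """Count correction terms without degree 1: (those of dimension m, those of smaller dimension)."""
    table = second_construction(v)
    r, m = len(table), 9 * v
    full_dim = lower_dim = 0
    for q in range(r):
        for s in range(r):
            for t in range(r):
                if q == s == t:
                    continue
                i_q, j_q, _ = table[q]
                _, j_s, k_s = table[s]
                i_t, _, k_t = table[t]
                degrees = [(g in i_q) + (g in j_q) + (g in j_s) + (g in k_s)
                           + (g in k_t) + (g in i_t) for g in range(m)]
                if 1 in degrees:
                    continue
                if all(d > 0 for d in degrees):
                    full_dim += 1
                else:
                    lower_dim += 1
    return full_dim, lower_dim
```

The first count is expected to be zero, in agreement with the lemma; the second is positive already for $v = 1$, which shows that the restriction to correction terms of dimension $m$ cannot simply be dropped.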


7.  Implicit  Canceling  of  Correction  Terms  of  Degree  1  and  Resulting  Algorithms. 

In this section we show how to cancel the correction terms of degree 1 of Table 5.1 defined in the two Constructions of the previous section.

At  first,  we  consider  the  following  class  of  linear  transformations  of  bilinear  problems  and 
algorithms. 

Definition 7.1. Let

$$T(X, Y, Z) = \sum_{\eta} b_{\eta}(X, Y) \, z_{\eta}, \qquad T^*(X^*, Y^*, Z^*) = \sum_{\eta} b^*_{\eta}(X^*, Y^*) \, z^*_{\eta}, \tag{7.1}$$

two trilinear forms in $X$, $Y$, $Z$ and in $X^*$, $Y^*$, $Z^*$ respectively, define two bilinear problems,

$$B = \{ b_{\eta}(X, Y) \}, \qquad B^* = \{ b^*_{\eta}(X^*, Y^*) \} \tag{7.2}$$

respectively. Let a linear transformation

$$X = X(X^*), \qquad Y = Y(Y^*), \qquad Z = Z(Z^*) \tag{7.3}$$

transform $T$ into $T^*$, that is

$$T(X(X^*), Y(Y^*), Z(Z^*)) = T^*(X^*, Y^*, Z^*) \tag{7.4}$$

identically in $X^*$, $Y^*$, $Z^*$. Then we write

$$B = B(B^*), \qquad T = T(T^*) \tag{7.5}$$

and call $B$ and $T$ linear images of $B^*$ and $T^*$ respectively.

The  next  illustrative  result  will  not  be  used  in  this  paper. 

Lemma 7.1. Let (7.1)-(7.5) hold so that $B = B(B^*)$ is a linear image of $B^*$. Then (see Definition 2.3)

$$\rho(B) \ge \rho(B^*). \tag{7.6}$$

Proof. Substitute (7.3) in a bilinear algorithm (2.6) of rank $M$ for the problem $B$. Then (see (7.4))

$$T(X, Y, Z) = T^*(X^*, Y^*, Z^*) = \sum_{q=1}^{M} L_q(X(X^*)) \, L'_q(Y(Y^*)) \, L''_q(Z(Z^*)).$$

This (constructively) defines a bilinear algorithm of rank $M$ for $B^*$. Choose $M = \rho(B)$ to obtain (7.6). ∎

It is tempting to apply Lemma 7.1 if one seeks upper estimates for $\rho(B^*)$. Then it would suffice to choose a bilinear problem $B$ of small rank such that $B$ is a linear image of $B^*$. However in the general case we do not have a regular way for the solution of the latter problem. (To appreciate its difficulty, try, for instance, to find a linear transformation which would show that $B = B(B^*)$ in the case $B^* = (m, m, m)$, $B$ is the PM problem defined by (2.4) where $p = q = m^2$, $\rho(B) = p + q - 1 = 2m^2 - 1$. If, contrary to our intuition, such a transformation existed then (7.6) would imply that $\rho((m, m, m)) = 2m^2 - 1$, see (2.10).)

Thus we prefer not to use Lemma 7.1. Instead, we will seek linear transformations that reduce the rank of the original algorithms generated by Table 5.1 by canceling the correction terms of degree 1. We call such transformations Implicit Canceling (IC) and the whole process that consists of the choice of Tables 5.1 and of IC Trilinear Aggregating with Implicit Canceling (TAIC); see [13].
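The idea of canceling can be illustrated on Table 4.1. The sketch below is a toy setup of ours, not the transformation defined in this paper: it assumes that the six matrices of Example 4.1 already satisfy the zero-sum conditions $\sum_j x_{ij} = \sum_k y_{jk} = \sum_i z_{ki} = 0$ and $\sum_j u_{jk} = \sum_k v_{ki} = \sum_i w_{ij} = 0$. Under these assumptions every united correction term of the 2-Procedure contains a vanishing factor, so the $IJK$ aggregates alone give $\mathrm{Tr}(XYZ) + \mathrm{Tr}(UVW)$.

```python
import random


def trace3(a, b, c):
    """Tr(ABC) for matrices given as nested lists; a is p x q, b is q x s, c is s x p."""
    return sum(a[i][j] * b[j][k] * c[k][i]
               for i in range(len(a)) for j in range(len(b)) for k in range(len(c)))


def zero_row_sums(rows, cols, rng):
    """Random integer matrix whose every row sums to zero (assumes cols >= 2)."""
    matrix = []
    for _ in range(rows):
        row = [rng.randint(-5, 5) for _ in range(cols - 1)]
        row.append(-sum(row))
        matrix.append(row)
    return matrix


def zero_column_sums(rows, cols, rng):
    """Random integer matrix whose every column sums to zero (transpose of the above)."""
    transposed = zero_row_sums(cols, rows, rng)
    return [[transposed[c][r] for c in range(cols)] for r in range(rows)]


def aggregates_only(x, y, z, u, v, w):
    """Sum of the IJK aggregates of Table 4.1, with no correction terms subtracted."""
    I, J, K = len(x), len(y), len(z)
    return sum((x[i][j] + u[j][k]) * (y[j][k] + v[k][i]) * (z[k][i] + w[i][j])
               for i in range(I) for j in range(J) for k in range(K))


def demo(I=2, J=3, K=4, seed=0):
    """Return (aggregate sum, Tr(XYZ) + Tr(UVW)) for matrices obeying the zero-sum assumptions."""
    rng = random.Random(seed)
    x, y, z = zero_row_sums(I, J, rng), zero_row_sums(J, K, rng), zero_row_sums(K, I, rng)
    u, v, w = zero_column_sums(J, K, rng), zero_column_sums(K, I, rng), zero_column_sums(I, J, rng)
    return aggregates_only(x, y, z, u, v, w), trace3(x, y, z) + trace3(u, v, w)
```

The two values returned by `demo()` coincide. The price of imposing such conditions is that the matrices no longer have independent entries, which is why the transformations below pass from matrices with indices in $0, \ldots, n - 1$ to smaller problems with indices in $1, \ldots, n - 1$.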

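Transformation (7.3) can be considered a triplet of transformations applied to $X$, $Y$, $Z$ separately of each other. In the sequel we apply the transformation (7.3) to the problems $B = \bigoplus_{s} (I(s), J(s), K(s))$. In such cases we compose (7.3) of $r$ triplets of linear transformations of $X^{*(s)}$, $Y^{*(s)}$, $Z^{*(s)}$ into $X^{(s)}$, $Y^{(s)}$, $Z^{(s)}$ for all $s$, $s = 1, \ldots, r$. To simplify the notation, we delete the superscripts $s$ and consider transformations of the triplets $(X^*, Y^*, Z^*)$ into $(X, Y, Z)$ and of the trilinear form

$$T(X, Y, Z) = \mathrm{Tr}(XYZ) = \sum_{(\underline{i}, \underline{j}, \underline{k}) \in D} x_{\underline{i}\underline{j}} \, y_{\underline{j}\underline{k}} \, z_{\underline{k}\underline{i}} \tag{7.7}$$

into another one,

$$T^*(X^*, Y^*, Z^*) = \mathrm{Tr}(X^* Y^* Z^*) = \sum_{(\underline{i}, \underline{j}, \underline{k}) \in D^*} x^*_{\underline{i}\underline{j}} \, y^*_{\underline{j}\underline{k}} \, z^*_{\underline{k}\underline{i}}. \tag{7.8}$$

(Recall Remark 5.2.)

Here $\underline{i} = (i(1), \ldots, i(\ell))$, $\underline{j} = (j(1), \ldots, j(\ell'))$, $\underline{k} = (k(1), \ldots, k(\ell''))$ (compare (5.2)-(5.4)). The relation $(\underline{i}, \underline{j}, \underline{k}) \in D$ (respectively $(\underline{i}, \underline{j}, \underline{k}) \in D^*$) under the sign $\sum$ designates the summation in $i(1), \ldots, i(\ell)$, $j(1), \ldots, j(\ell')$, $k(1), \ldots, k(\ell'')$ from 0 (respectively 1) to $n - 1$. The latter comments also define two domains, $D$ and $D^*$, where the $\underline{i}$, $\underline{j}$, $\underline{k}$ range.

The trilinear forms of (7.7), (7.8) define the problems $(I, J, K)$ and $(I^*, J^*, K^*)$ respectively where

$$I = n^{\ell}, \qquad J = n^{\ell'}, \qquad K = n^{\ell''}, \qquad H = IJK = n^m, \tag{7.9}$$

$$I^* = (n-1)^{\ell}, \qquad J^* = (n-1)^{\ell'}, \qquad K^* = (n-1)^{\ell''}, \qquad H^* = I^* J^* K^* = (n-1)^m. \tag{7.10}$$

Here is one of possible linear transformations of (7.7) into (7.8).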

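$$x_{\underline{i}\underline{j}} = x^*_{\underline{i}\underline{j}}, \qquad y_{\underline{j}\underline{k}} = y^*_{\underline{j}\underline{k}}, \qquad z_{\underline{k}\underline{i}} = z^*_{\underline{k}\underline{i}} \quad \text{for } (\underline{i}, \underline{j}, \underline{k}) \in D^*, \tag{7.11}$$

$$z_{\underline{k}\underline{i}} = 0 \ \text{if } k(t'') = 0, \qquad \sum_{k(t'')=0}^{n-1} y_{\underline{j}\underline{k}} = 0, \tag{7.12}$$

$$y_{\underline{j}\underline{k}} = 0 \ \text{if } j(t') = 0, \qquad \sum_{j(t')=0}^{n-1} x_{\underline{i}\underline{j}} = 0, \tag{7.13}$$

$$z_{\underline{k}\underline{i}} = 0 \ \text{if } i(t) = 0, \qquad \sum_{i(t)=0}^{n-1} x_{\underline{i}\underline{j}} = 0. \tag{7.14}$$

We assume that all unbounded entries of $\underline{i}$, $\underline{j}$, $\underline{k}$ that are used in (7.12)-(7.14) range in the domain $D$ and that $t$, $t'$, $t''$ range as follows, $t'' = 1, \ldots, \ell''$ in (7.12), $t' = 1, \ldots, \ell'$ in (7.13), and $t = 1, \ldots, \ell$ in (7.14).

Equations (7.11)-(7.14) contain some implicit expressions of $x_{\underline{i}\underline{j}}$ and $y_{\underline{j}\underline{k}}$ as linear functions of $x^*_{\underline{i}\underline{j}}$, $y^*_{\underline{j}\underline{k}}$. To make them explicit, rewrite the second equations of (7.12)-(7.14) so that for each triplet $t$, $t'$, $t''$ all indeterminates are moved to the right sides except the following ones which remain in the left sides,

$y_{\underline{j}\underline{k}}$ where $k(t'') = 0$ in (7.12),

$x_{\underline{i}\underline{j}}$ where $j(t') = 0$ in (7.13),

$x_{\underline{i}\underline{j}}$ where $i(t) = 0$ in (7.14).

Then substitute (7.11) in the right sides.

Now apply a variation of the linear transformation (7.11)-(7.14) to each of the $r$ triplets $X = X^{(s)}$, $Y = Y^{(s)}$, $Z = Z^{(s)}$, $s = 1, \ldots, r$, of indeterminates of Table 5.1 defined by our First Construction of Section 6. In that variation preserve (7.11)-(7.14) for all $t$, $t''$ and also for all $t' \le u$ (then $j(t') \in \underline{h}_1$). If $t' > u$ (then $j(t') \in \underline{h}_2$) substitute the following equations for (7.13),

$$x_{\underline{i}\underline{j}} = 0 \ \text{if } j(t') = 0, \qquad \sum_{j(t')=0}^{n-1} y_{\underline{j}\underline{k}} = 0, \qquad t' > u. \tag{7.15}$$
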
Notice that, by virtue of Lemmas 5.2, 6.1, the above transformation applied to the First Construction of Section 6 cancels all correction terms of Table 5.1. This gives us the following estimate; see (5.10), (6.2), (7.9), (7.10).

Theorem 7.1. For arbitrary natural numbers $u$ and $n$,

$$[(2u)!/(u!)^2] \odot ((n-1)^u, (n-1)^{2u}, (n-1)^u) \leftarrow n^{4u} \odot (1, 1, 1). \tag{7.16}$$

We will call the transformation (7.11)-(7.15) the First Transformation for Implicit Canceling. The associated equation of (7.16) for a fixed $n$ and sufficiently large $u$ implies the following estimate (see Theorem 3.1).

Corollary 7.1. For arbitrary natural $n$, $\beta(n) = 3(2 \log n - \log 2)/(2 \log(n - 1))$ is a limiting exponent of MM; in particular, $\beta(9) < 2.67$ is a limiting exponent of MM.
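The numerical claim of Corollary 7.1 is easy to reproduce. The sketch below is ours and makes one assumption explicit: the associated equation of (7.16) is taken in the form $r \, (I^* J^* K^*)^{\tau} = M$ (equation (3.29) for $h = 1$ with $r$ equal terms), where $r = (2u)!/(u!)^2$, $I^* J^* K^* = (n-1)^{4u}$ and $M = n^{4u}$. The function names are hypothetical.

```python
from math import comb, log


def beta_limit(n):
    """beta(n) = 3 (2 log n - log 2) / (2 log(n - 1)) from Corollary 7.1 (requires n >= 3)."""
    return 3 * (2 * log(n) - log(2)) / (2 * log(n - 1))


def beta_finite(n, u):
    """3 * tau, where tau solves r * ((n-1)^(4u))^tau = n^(4u) with r = (2u)!/(u!)^2."""
    r = comb(2 * u, u)
    tau = (4 * u * log(n) - log(r)) / (4 * u * log(n - 1))
    return 3 * tau


def best_n(candidates=range(3, 60)):
    """The n among the candidates that minimizes beta(n)."""
    return min(candidates, key=beta_limit)
```

Since $(2u)!/(u!)^2$ grows like $4^u$ up to a factor polynomial in $u$, `beta_finite(n, u)` decreases to `beta_limit(n)` as `u` grows. Among the candidates tried here the minimum of `beta_limit` is attained at $n = 9$, where the value lies just below 2.67.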

Next  we  define  our  second  linear  transformation  which  also  transforms  (7.7)  into  (7.8)  and 
enables  us  to  cancel  all  correction  terms  of  degree  1  in  any  Table  5.1. 

We define this transformation recursively in $m$ steps. With each step we associate a new value of $t''$, $t'$ or $t$. For instance, we can successively choose $t'' = 1, \ldots, \ell''$, then $t' = 1, \ldots, \ell'$, then $t = 1, \ldots, \ell$.

Here is the first step of the transformation in the case $\ell'' = 1$ where we designate $\underline{k} = k(1) = k$,

$$\forall i \, \forall j: \ x_{ij} = x^*_{ij}. \tag{7.17}$$

$$\forall j \, \forall k \ (k \ne 0): \ y_{jk} = y^*_{jk}. \tag{7.18}$$

$$\forall i \, \forall j: \ \sum_{h=0}^{n-1} y_{jh} = \sum_{h=0}^{n-1} z_{hi} = 0. \tag{7.19}$$

$$\forall i \, \forall k \ (k \ne 0): \ z_{ki} + \sum_{h=1}^{n-1} z_{hi} = z^*_{ki}. \tag{7.20}$$

Equations (7.19) contain implicit expressions of $y_{j0}$, $z_{0i}$ through $y^*_{jk}$, $z_{ki}$, $k = 1, \ldots, n - 1$, which can be easily turned into explicit ones. Similarly Equations (7.20) implicitly express $z_{ki}$ as linear functions of $z^*_{ki}$ for $k = 1, \ldots, n - 1$. To obtain the explicit expressions, we have to solve (7.20) over $F$ for each $i$ as a system of linear equations in $z_{ki}$, $k = 1, \ldots, n - 1$. The next simple result shows that the solution exists if $n \ne 0$ in $F$.

Lemma 7.2. For each $i$ the determinant of the system of Equations (7.20) in $z_{1i}, \ldots, z_{n-1,i}$ is equal to $n$.
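For fixed $i$ the coefficient matrix of (7.20) is the $(n-1) \times (n-1)$ identity matrix plus the all-ones matrix. Lemma 7.2 can be confirmed for small $n$ with exact rational arithmetic; the sketch below is only an illustration with helper names of our choosing.

```python
from fractions import Fraction


def system_matrix(n):
    """Coefficient matrix of (7.20) for fixed i: 2 on the diagonal, 1 elsewhere, size n - 1."""
    size = n - 1
    return [[Fraction(1 + (a == b)) for b in range(size)] for a in range(size)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det
```

Evaluating `determinant(system_matrix(n))` for small `n` returns `n` itself, so the system is solvable exactly when $n$ is invertible in $F$.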


Next we examine how the transformation (7.17)-(7.20) changes the trilinear form $T(X, Y, Z)$. We write that

$$T = T(X, Y, Z) = \sum_{i,j} x_{ij} \sum_{k=0}^{n-1} y_{jk} z_{ki} = \sum_{i,j} x_{ij} \Big( \sum_{k=1}^{n-1} y_{jk} z_{ki} + y_{j0} z_{0i} \Big).$$

Substitute $y_{j0} = -\sum_{k=1}^{n-1} y_{jk}$ (see (7.19)) and obtain

$$T = \sum_{i,j} x_{ij} \sum_{k=1}^{n-1} y_{jk} (z_{ki} - z_{0i}).$$

Then substitute $z_{0i} = -\sum_{h=1}^{n-1} z_{hi}$ (see (7.19)). This gives

$$T = \sum_{i,j} x_{ij} \sum_{k=1}^{n-1} y_{jk} \Big( z_{ki} + \sum_{h=1}^{n-1} z_{hi} \Big). \tag{7.21}$$

Substitute (7.17), (7.18), (7.20) in (7.21) and obtain

$$T = T(X, Y, Z) = \sum_{i,j} x^*_{ij} \sum_{k=1}^{n-1} y^*_{jk} z^*_{ki} = T^*(X^*, Y^*, Z^*).$$

We come to the following result.

Lemma 7.3. For arbitrary $\ell$, $\ell'$, $n$ ($n \ne 0$ in $F$) the linear transformation (7.17)-(7.20) transforms $(n^{\ell}, n^{\ell'}, n)$ into $(n^{\ell}, n^{\ell'}, n - 1)$.
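The derivation above can be replayed on concrete numbers. The sketch below is ours and works over the rationals: it takes arbitrary $X^*$, $Y^*$, $Z^*$ with $k$ ranging over $1, \ldots, n - 1$, builds $X$, $Y$, $Z$ with $k$ ranging over $0, \ldots, n - 1$ from (7.17)-(7.20), and returns both traces for comparison. The closed form $z_{ki} = z^*_{ki} - \frac{1}{n} \sum_{h} z^*_{hi}$ used for solving (7.20) follows from summing (7.20) over $k$.

```python
from fractions import Fraction


def first_step(x_star, y_star, z_star, n):
    """Build X, Y, Z from X*, Y*, Z* by (7.17)-(7.20).

    x_star is I x J, y_star is J x (n-1), z_star is (n-1) x I;
    column/row number 0 of y_star/z_star corresponds to k = 1.
    """
    num_i = len(x_star)
    x = [row[:] for row in x_star]                       # (7.17)
    # (7.18) for k >= 1, and y_{j0} = -sum_{k>=1} y_{jk} from (7.19)
    y = [[-sum(row)] + list(row) for row in y_star]
    z = [[Fraction(0)] * num_i for _ in range(n)]
    for i in range(num_i):
        mean = Fraction(sum(z_star[k][i] for k in range(n - 1)), n)
        for k in range(1, n):
            z[k][i] = z_star[k - 1][i] - mean            # solution of (7.20)
        z[0][i] = -sum(z[k][i] for k in range(1, n))     # (7.19)
    return x, y, z


def trace3(a, b, c):
    """Tr(ABC) for matrices given as nested lists; a is p x q, b is q x s, c is s x p."""
    return sum(a[i][j] * b[j][k] * c[k][i]
               for i in range(len(a)) for j in range(len(b)) for k in range(len(c)))


def compare_traces(x_star, y_star, z_star, n):
    """Return (Tr(XYZ), Tr(X*Y*Z*)); Lemma 7.3 asserts that they are equal."""
    x, y, z = first_step(x_star, y_star, z_star, n)
    return trace3(x, y, z), trace3(x_star, y_star, z_star)
```

With integer inputs the entries of $Z$ are rationals with denominator dividing $n$, which reflects the requirement $n \ne 0$ in $F$.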

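In the case $\ell'' > 1$ we can generalize (7.17)-(7.20) using the following notation.

Notation 7.1. Delete the entry $k(t'')$ of the vector $\underline{k}$. Designate the resulting vector by $\underline{k}(t'')$. Designate $\underline{k} = \underline{k}(t'') k(t'')$ in the case where all entries of $\underline{k}$ are considered integer parameters. If the value of $k(t'')$ is fixed, $k(t'') = h$, and if other entries of $\underline{k}$ are parameters, designate $\underline{k} = \underline{k}(t'') h$.

Then the transformation (7.17)-(7.20) can be generalized to the case $\ell'' > 1$ where $t''$ is fixed, $1 \le t'' \le \ell''$. Let (7.17) be preserved and the following equations substitute for (7.18)-(7.20).

$$\forall \underline{j} \, \forall \underline{k}(t'') \, \forall k(t'') \ (k(t'') \ne 0): \ y_{\underline{j}\underline{k}} = y^*_{\underline{j}\underline{k}}. \tag{7.22}$$

$$\forall \underline{i} \, \forall \underline{j} \, \forall \underline{k}(t''): \ \sum_{h=0}^{n-1} y_{\underline{j}, \underline{k}(t'')h} = \sum_{h=0}^{n-1} z_{\underline{k}(t'')h, \underline{i}} = 0. \tag{7.23}$$

$$\forall \underline{i} \, \forall \underline{k}(t'') \, \forall k(t'') \ (k(t'') \ne 0): \ z_{\underline{k}\underline{i}} + \sum_{h=1}^{n-1} z_{\underline{k}(t'')h, \underline{i}} = z^*_{\underline{k}\underline{i}}. \tag{7.24}$$

Remark 7.1. If $\sum_{h=0}^{n-1} z^*_{\underline{k}(q'')h, \underline{i}} = 0$ and (7.24) holds then $\sum_{h=0}^{n-1} z_{\underline{k}(q'')h, \underline{i}} = 0$ for any $q''$, $1 \le q'' \le \ell''$, $q'' \ne t''$.

Then similarly to Lemma 7.3 the following result can be obtained.

Lemma 7.4. For arbitrary $\ell$, $\ell'$, $\ell''$, $n$ ($n \ne 0$ in $F$) the linear transformation (7.17), (7.22)-(7.24) transforms $(n^{\ell}, n^{\ell'}, n^{\ell''})$ into $(n^{\ell}, n^{\ell'}, n^{\ell''-1}(n-1))$. Similarly $(n^{\ell}, n^{\ell'}, n^{\ell''})$ can be transformed into $(n^{\ell}, n^{\ell'-1}(n-1), n^{\ell''})$ and into $(n^{\ell-1}(n-1), n^{\ell'}, n^{\ell''})$.
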

Recursively applying the three latter transformations we obtain the desired linear functions
(7.3) that for arbitrary $n \ne 0$, $\ell$, $\ell'$, $\ell''$ transform $(n^{\ell}, n^{\ell'}, n^{\ell''})$ into $((n-1)^{\ell}, (n-1)^{\ell'}, (n-1)^{\ell''})$.
We  call  such  a  process  the  Second  Transformation  for  Implicit  Canceling.  Its  efficiency  stems  from 
the  following  fact  which  can  be  easily  verified  using  Remark  7.1  and  similar  observations. 

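Lemma 7.5. Let functions (7.3) define the Second Transformation for Implicit Canceling. Then
(7.23) holds for all $t''$, $t'' = 1, \ldots, \ell''$, as well as the following equations.

$$\forall j\ \forall k\ \forall \hat{\imath}(t):\quad \sum_{h=0}^{n-1} x_{\hat{\imath}(t)h,\,j} = \sum_{h=0}^{n-1} z_{k,\,\hat{\imath}(t)h} = 0, \quad t = 1, \ldots, \ell. \tag{7.25}$$

$$\forall k\ \forall i\ \forall \hat{\jmath}(t'):\quad \sum_{h=0}^{n-1} x_{i,\,\hat{\jmath}(t')h} = \sum_{h=0}^{n-1} y_{\hat{\jmath}(t')h,\,k} = 0, \quad t' = 1, \ldots, \ell'. \tag{7.26}$$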
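The observation of Remark 7.1, on which Lemma 7.5 rests, can be illustrated on a two-coordinate index $k = (k_1, k_2)$. The sketch below is a toy check under stated assumptions: the new variables $z'$ are given for $k_1 = 1, \ldots, n-1$ with zero sums along the second coordinate, and the old variables $z$ are recovered by inverting the change of variables along the first coordinate. The function names are hypothetical and the column index $i$ is dropped for brevity.

```python
from fractions import Fraction
import random


def invert_axis0(zp, n):
    """Given rows z'[k1] for k1 = 1..n-1, return rows z[k1] for k1 = 0..n-1 such that
    z'[k1] = z[k1] + sum_{h>=1} z[h] for k1 >= 1 and every column of z sums to zero."""
    cols = len(zp[0])
    s = [sum(row[c] for row in zp) / Fraction(n) for c in range(cols)]  # needs n != 0
    z = [[-s[c] for c in range(cols)]]                                  # row k1 = 0
    z += [[row[c] - s[c] for c in range(cols)] for row in zp]           # rows k1 >= 1
    return z


def random_zero_row_sums(rows, cols, rng):
    """Random matrix in which each row sums to zero (zero sums along the second coordinate)."""
    out = []
    for _ in range(rows):
        head = [Fraction(rng.randint(-9, 9)) for _ in range(cols - 1)]
        out.append(head + [-sum(head)])
    return out


def check(n, seed=0):
    rng = random.Random(seed)
    zp = random_zero_row_sums(n - 1, n, rng)
    z = invert_axis0(zp, n)
    zero_along_first = all(sum(z[k1][k2] for k1 in range(n)) == 0 for k2 in range(n))
    zero_along_second = all(sum(row) == 0 for row in z)
    return zero_along_first and zero_along_second


if __name__ == "__main__":
    print(all(check(n, seed) for n in range(2, 7) for seed in range(5)))
```

The zero sums imposed along one coordinate survive the transformation applied along the other, so after all coordinates have been processed the zero-sum equations hold in every coordinate at once.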

Corollary 7.2. Let the Second Transformation for IC be applied to an arbitrary Table 5.1. Then
Equations (7.23), (7.25), (7.26) cancel all correction terms of degree 1.

(Corollary  7.2  follows  from  Lemmas  5.2,  7.5.) 

In particular, if Table 5.1 is defined by the First Construction of Section 6 then all correction
terms of Table 5.1 are canceled. This gives another proof of (7.16) (for $n \ne 0$ in F). If Table 5.1
is defined by the Second Construction of Section 6 then only the correction terms of dimensions at
most $m - 1$ are not canceled by the Second Transformation for IC. This gives the following result.

Corollary 7.3. For arbitrary field F and natural v, n ($n \ne 0$ in F)

$$\odot\big((n-1)^{3v}, (n-1)^{3v}, (n-1)^{3v}\big) \leftarrow (n^{9v} + \rho_c^*) \odot (1, 1, 1),$$

where $\rho_c^*$ is the rank of the sum of all instances of all correction terms of Table 5.1 transformed
by the Second Transformation for IC. Here Table 5.1 is defined by the Second Construction of
Section 6.

Our  next  objective  is  the  following  estimate. 

Lemma  7.6.  Under  the  conditions  of  Corollary  7.3, 

$$n^{9v} + \rho_c^* \le (n+1)^{9v}. \tag{7.27}$$

Proof. Let Table 5.1 be defined by the Second Construction of Section 6. Let the Second
Transformation for IC be applied. Then for all $\mu$ consider all possible sets of different integers

$$G = \{g_1, \ldots, g_\mu\}, \quad 1 \le g_\nu \le 9v, \quad \nu = 1, \ldots, \mu, \quad \mu = 0, 1, \ldots, 9v. \tag{7.28}$$

Let one of such sets be fixed. Substitute zeroes for each indeterminate
in Table 5.1 unless such an indeterminate has degree zero in $h(g_\nu)$ for $\nu = 1, \ldots, \mu$. Call the
resulting table the Auxiliary Table associated with the set $\{g_1, \ldots, g_\mu\}$. (Table 5.1 itself is
associated with the empty set.) Notice that for $\mu \ge 1$ all principal terms of all Auxiliary Tables
are zeroes.

Multiply the aggregate of the Auxiliary Table associated with the set $\{g_1, \ldots, g_\mu\}$ by $(-n)^\mu$.
Sum the results for all values of all entries $h(g) \in h$ such that $g \notin \{g_1, \ldots, g_\mu\}$ and for all possible
sets $\{g_1, \ldots, g_\mu\}$, $\mu = 0, 1, \ldots, 9v$. As can be verified, no correction terms of dimensions less
than m remain in the resulting total sum. Hence the sum is identically T(X, Y, Z) because the
correction terms of dimension m are canceled, by virtue of Lemmas 5.2, 6.2, 7.5. It remains to
estimate $n^{9v} + \rho_c^*$, the rank of the sum of all instances of all aggregates in all of our Auxiliary
Tables, in order to prove (7.27). (This whole procedure for canceling the terms of dimensions less
than m is general. It can be called the Alternating Summation of Aggregates.)

The  desired  upper  estimate  (7.27)  can  be  obtained  from  the  next  two  simple  lemmas. 

Lemma 7.7. For a natural $\mu$, $0 \le \mu \le 9v$, and for an Auxiliary Table associated with a set
$\{g_1, \ldots, g_\mu\}$ (see (7.28)) there exist at most $n^{9v-\mu}$ instances of the aggregate of that table.


Lemma 7.8. For an arbitrary natural $\mu$, $0 \le \mu \le 9v$, there exist at most $\binom{9v}{\mu} = (9v)!/(\mu!\,(9v-\mu)!)$
different sets $\{g_1, \ldots, g_\mu\}$ where $g_\nu$ are natural numbers, $1 \le g_\nu \le 9v$.
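Lemmas 7.7 and 7.8 combine into (7.27) through the binomial theorem: summing at most $n^{9v-\mu}$ instances over at most $\binom{9v}{\mu}$ sets for $\mu = 0, \ldots, 9v$ gives $(n+1)^{9v}$. The short sketch below checks this count in exact integer arithmetic. The function name `aggregate_instance_bound` is illustrative and does not appear in the paper.

```python
from math import comb


def aggregate_instance_bound(n, v):
    """Sum over mu of (number of sets of size mu) * (instances per Auxiliary Table)."""
    m = 9 * v
    return sum(comb(m, mu) * n ** (m - mu) for mu in range(m + 1))


if __name__ == "__main__":
    for n, v in [(2, 1), (5, 1), (20, 1), (3, 2)]:
        # The two printed numbers agree by the binomial theorem.
        print(n, v, aggregate_instance_bound(n, v), (n + 1) ** (9 * v))
```

The term with $\mu = 0$ contributes $n^{9v}$, the instances of the aggregate of Table 5.1 itself, and the remaining terms bound the contribution of the Auxiliary Tables.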


Corollary 7.4. For arbitrary field of constants F and for all natural v, n ($n \ne 0$ in F), the
following mapping is valid.

$$\odot\big((n-1)^{3v}, (n-1)^{3v}, (n-1)^{3v}\big) \leftarrow (n+1)^{9v} \odot (1, 1, 1).$$

The associated equations for a fixed n and for $v \to \infty$ define the limiting exponents of MM,

$$\beta(n) = \log\big((n+1)^3/3\big) / \log(n-1), \tag{7.29}$$

in particular,

$$\beta(20) < 2.7288.$$
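For orientation, the limiting exponent can be tabulated numerically. The sketch below assumes that (7.29) reads $\log((n+1)^3/3)/\log(n-1)$; that reading, and the function name `limiting_exponent`, are assumptions of this example. The base of the logarithm cancels in the ratio, so natural logarithms are used.

```python
import math


def limiting_exponent(n):
    """Evaluate log((n + 1)^3 / 3) / log(n - 1) for an integer n >= 3."""
    return math.log((n + 1) ** 3 / 3) / math.log(n - 1)


if __name__ == "__main__":
    # Print a few values around n = 20 for comparison with the quoted bound.
    for n in range(17, 25):
        print(n, round(limiting_exponent(n), 5))
```

Under this reading the expression is roughly 2.72886 at n = 20 and roughly 2.72872 at n = 21, where it is smallest over integer n, and it tends to 3 as n grows.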


8.  Conclusions. 

How can the techniques of the previous sections be improved? One of the natural ways is to
improve the Constructions of Section 6.

Corollary  7.2  enables  us  to  cancel  all  correction  terms  of  degree  1.  The  method  of  the 
Alternating  Summation  of  Aggregates  (see  the  proof  of  Lemma  7.6)  can  be  generalized  for  canceling 
the  terms  of  dimensions  less  than  m.  It  remains  to  design  Generating  Table  5.1  where  all  correction 
terms  of  dimension  m  would  have  degree  1  in  some  of  the  h(g).  We  proved  such  a  property  for  the 
Second  Construction  of  Section  6.  The  proof  and  hence  the  result  itself  can  be  extended  to  any 
Table 5.1 such that the vectors of subscripts k(q)} £(#), £(t) are disjoint only if q = s = t.
Is it possible to obtain Table 5.1 with r rows where the latter property holds and where
$3(m \log n - \log r)/(m \log(n-1))$ is substantially less than in (7.29)? (See (5.1), (7.9), (7.10),
(7.29).)

Here is another way that seems to be more promising. One can generalize Tables 5.1 by
turning them into the following ones, which we call Generating λ-Tables. (We represent only the
s-th row of the tables, assuming that s =

Table 8.1.

$\alpha(s,\lambda)\, x^{(s)}_{i(s)j(s)}$ | $\beta(s,\lambda)\, y^{(s)}_{j(s)k(s)}$ | $\gamma(s,\lambda)\, z^{(s)}_{k(s)i(s)}$


Here $\alpha(s,\lambda)$, $\beta(s,\lambda)$, $\gamma(s,\lambda)$ are constants of F such that

$$\forall s:\quad \sum_{\lambda} \alpha(s,\lambda)\,\beta(s,\lambda)\,\gamma(s,\lambda) = 1.$$

We assume that the aggregates of Table 8.1 are to be summed for all values of $\lambda$. (In particular,
if $\lambda$ is a constant and $\alpha(s,\lambda) = \beta(s,\lambda) = \gamma(s,\lambda) = 1$ for all s, then we come back to Table
5.1.) The coefficients $\alpha(s,\lambda)$, $\beta(s,\lambda)$, $\gamma(s,\lambda)$ can be chosen such that several correction terms are
canceled in the result of the summation in $\lambda$. More precisely, it is sufficient to satisfy the equation

$$\sum_{\lambda} \alpha(q,\lambda)\,\beta(s,\lambda)\,\gamma(t,\lambda) = 0 \tag{8.1}$$

in order to cancel the correction term $\pi_\lambda(q,s,t)$,

$$\sum_{\lambda} \pi_\lambda(q,s,t) = 0. \tag{8.2}$$

In  particular,  in  some  cases  this  observation  enables  us  to  cancel  even  the  correction  terms  whose 
degrees  in  all  h(g)  are  greater  than  1  (if  such  terms  appear  in  Table  8.1). 
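A toy choice of coefficients shows how the normalization and the canceling condition (8.1) can hold together. The example below is not taken from the paper: it assumes three rows, three values of the summation parameter (written `lam`), and the complex field, with coefficients built from a primitive cube root of unity. All names (`R`, `w`, `alpha`, `beta`, `gamma`, `coefficient`, `canceled_triples`) are hypothetical.

```python
import cmath
from itertools import product

R = 3                              # number of rows s and of values of lam (toy size)
w = cmath.exp(2j * cmath.pi / R)   # primitive cube root of unity


def alpha(s, lam):
    return w ** (s * lam)


def beta(s, lam):
    return w ** (s * lam)


def gamma(s, lam):
    return w ** (s * lam) / R


def coefficient(q, s, t):
    """Sum over lam of alpha(q, lam) * beta(s, lam) * gamma(t, lam)."""
    return sum(alpha(q, lam) * beta(s, lam) * gamma(t, lam) for lam in range(R))


def canceled_triples(tol=1e-12):
    """Triples (q, s, t) whose coefficient vanishes, i.e. for which (8.1) holds."""
    return [(q, s, t) for q, s, t in product(range(R), repeat=3)
            if abs(coefficient(q, s, t)) < tol]


if __name__ == "__main__":
    print([round(abs(coefficient(s, s, s)), 12) for s in range(R)])  # normalization
    print(len(canceled_triples()), "of", R ** 3, "triples are canceled")
```

Here the sum equals 1 whenever q + s + t is divisible by 3 and vanishes otherwise, so the diagonal triples q = s = t keep coefficient 1 while every triple with q + s + t not divisible by 3 is canceled. The triples that are permutations of (0, 1, 2) survive, which illustrates that a given choice of coefficients cancels several correction terms but not necessarily all of them.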

In fact, such a trick was successfully applied in [3, 12] under the name Trilinear Canceling
(see also [9]). On the other hand, the Generating λ-Tables can be used to define λ-algorithms
for MM which turn out to coincide with APA-algorithms if $\alpha(s,\lambda)$, $\beta(s,\lambda)$, $\gamma(s,\lambda)$ are rational
functions of $\lambda$ and if the consideration is modulo $\lambda$. In such a setting the application of (8.1),
(8.2) as a means of canceling is generally efficient. This is formally proven in the basic theorem
on the relations between usual algorithms and APA-algorithms. (Such an interpretation of the
theorem can be seen from the original illuminating proof given in [6], which is not repeated in any of
the papers [7-10, 17].) During the study of APA-algorithms this direction has remained in the
shadows. However, regarding the relationship between APA-algorithms and λ-Tables, the approach
of [6] seems important and might become fruitful in the future.

In particular, it is important to understand the most efficient ways of canceling the correction
terms of Generating λ-Tables. It might happen that the existing methods already rely on nearly
optimal ways of such canceling. However, because of the extreme irregularity of the asymptotically
fastest known algorithms for MM, we might be far from understanding the successful methods
of canceling hidden in those algorithms. Then further efforts in the analysis of the best existing
methods of MM can become fruitful.